
Prof. Dr.-Ing. Marcus Barkowsky

Research:

  • Quality of Experience
  • Algorithms for Video Quality Analysis
  • Algorithmic modelling of the human visual system
  • Analysis of subjective experiments
  • Perceptual methods in video coding

Vice Dean

Vice Dean of the “Applied Computer Science” Faculty

Study Coordinator “Interactive Systems and Internet of Things”

DEGG's 2.17

0991/3615-409


Office hours

Friday, 12:45-13:30; appointments by email


Contribution
  • J. Bialkowski
  • M. Menden
  • Marcus Barkowsky
  • A. Kaup

A Fast H.263 to H.264 Inter-Frame Transcoder with Motion Vector Refinement.

In: Picture Coding Symposium (PCS) 2004. pg. 47-52

San Francisco

  • (2004)
Video transcoding techniques supply interoperability of a great variety of devices that can be connected by various communication networks with different data rate requirements. Particularly inhomogeneous video transcoding is the conversion of an existing video bitstream from one standard into a bitstream of another standard, for example the conversion from H.263 data into H.264 data. It may also include parameter adaptations such as bitrate or frame rate reduction. In this work, we present a low-complexity transcoder design for transcoding Interframe macroblocks from H.263 to H.264. The large complexity reduction comes from reusing motion vectors of the input bitstream and from the fact that only a subset of all possible H.264 coding parameters is used. The selection of these parameters is based on statistical investigations of encoded H.264 parameters from a full parameter search on decoded H.263 sequences. Our approach leads to small rate-distortion losses compared to the full parameter encoder below 0.5 dB at comparable data rates, but the computational complexity reduction is over 98% for finding a suitable macroblock decision. Compared to simply copying the motion vectors without post-processing, the rate-distortion gain of our approach is up to 2 dB at equivalent rate.
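The motion vector reuse described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes integer-pel vectors, a sum of absolute differences (SAD) cost, and frames stored as lists of rows. The names `sad` and `refine_motion_vector` and the parameters `size` and `radius` are hypothetical.

```python
def sad(cur, ref, bx, by, mvx, mvy, size):
    """Sum of absolute differences between a block in `cur` and the
    block displaced by (mvx, mvy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + mvy][bx + x + mvx])
    return total


def refine_motion_vector(cur, ref, bx, by, mv_in, size=4, radius=1):
    """Start from the decoded input vector `mv_in` and test only the
    candidates in a small window around it (assumed refinement scheme)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mvx, mvy = mv_in[0] + dx, mv_in[1] + dy
            # Skip candidates that point outside the reference frame.
            if not (0 <= bx + mvx and bx + mvx + size <= width):
                continue
            if not (0 <= by + mvy and by + mvy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, mvx, mvy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost
```

With `radius=1` only nine candidates are evaluated per block, which is where the saving over a full search would come from in a scheme of this kind.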
Contribution
  • J. Bialkowski
  • Marcus Barkowsky
  • A. Kaup

On Requantization in Intra-Frame Video Transcoding with Different Transform Block Sizes.

In: 2005 IEEE 7th Workshop on Multimedia Signal Processing. pg. 1-4

  • (2005)

DOI: 10.1109/MMSP.2005.248669

Transcoding is a technique to convert one video bit-stream into another. While homogeneous transcoding is done at the same coding standard, inhomogeneous transcoding converts from one standard format to another standard. Inhomogeneous transcoding between MPEG-2, MPEG-4 or H.263 was performed using the same transform. With the standardisation of H.264 also a new transform basis and different block size was defined. For requantization from block size 8×8 to 4×4 this leads to the effect that the quantization error of one coefficient in a block of size 8×8 is distributed over multiple coefficients in blocks of size 4×4. In our work, we analyze the requantization process for inhomogeneous transcoding with different transforms. The deduced equations result in an expression for the correlation of the error contributions from the coefficients of block size 8×8 at each coefficient of block size 4×4. We then compare the mathematical analysis to simulations on real sequences. The reference to the requantization process is the direct quantization of the undistorted signal. It will be shown that the loss is as high as 3 dB PSNR at equivalent step size for input and output bitstream. Also an equation for the choice of the second quantization step size in dependency of the requantization loss is deduced. The model is then extended from the DCT to the integer-based transform as defined in H.264.
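The comparison between requantization and direct quantization of the undistorted signal can be sketched for the scalar case. This simplified illustration ignores the transform and the block-size change analysed in the paper, and it assumes uniform quantizers that reconstruct at the nearest multiple of the step size. All function names are hypothetical.

```python
def quantize(value, step):
    """Uniform quantizer: reconstruct at the nearest multiple of `step`."""
    return step * round(value / step)


def mse(samples, reconstructed):
    """Mean squared error between original and reconstructed samples."""
    return sum((s - r) ** 2 for s, r in zip(samples, reconstructed)) / len(samples)


def requantization_loss(samples, step_in, step_out):
    """Return (MSE of direct quantization, MSE of quantize-then-requantize)."""
    direct = [quantize(s, step_out) for s in samples]
    cascaded = [quantize(quantize(s, step_in), step_out) for s in samples]
    return mse(samples, direct), mse(samples, cascaded)
```

Both paths end on multiples of `step_out`, but only the direct path always picks the multiple nearest to the original sample, so the cascaded error cannot be smaller.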
Contribution
  • Marcus Barkowsky
  • J. Bialkowski
  • A. Kaup

Subjektiver Videobetrachtungstest für niederratige Multimedia-Szenarien.

In: ITG Fachbericht 188: Elektronische Medien 2005. pg. 169-175

VDE-Verlag

  • (2005)
Contribution
  • J. Bialkowski
  • Marcus Barkowsky
  • F. Leschka
  • A. Kaup

Low-Complexity Transcoding of Inter Coded Video Frames from H.264 to H.263.

In: 2006 IEEE International Conference on Image Processing. pg. 837-840

  • (2006)
The presented work addresses the reduction of computational complexity for transcoding of interframes from H.264 to H.263 baseline profiles while maintaining the quality of a full search approach. This scenario aims to achieve fast backward compatible interoperability between new and existing video coding platforms, e.g. between DVB-H and UMTS. By exploiting side information of the H.264 input bitstream the encoding complexity of the motion estimation is strongly reduced. Due to the possibility to divide a macroblock (MB) into partitions with different motion vectors (MV), one single MV has to be selected for H.263. It will be shown that this vector is suboptimal for all sequences, even if all existing MVs of a MB of H.264 are compared as candidates. Also motion vector refinement with a fixed ½-pel refinement window as used by transcoders throughout the literature is not sufficient for scenes with fast movement. We propose an algorithm for selecting a suitable vector candidate from the input bitstream and this MV is then refined using an adaptive window. Using this technique, the complexity is still low at nearly optimum rate-distortion results compared to an exhaustive full-search approach.
Contribution
  • J. Bialkowski
  • Marcus Barkowsky
  • A. Kaup

A new algorithm for reducing the requantization loss in video transcoding.

In: 2006 14th European Signal Processing Conference. pg. 1-5

  • (2006)
Video transcoders are devices that convert one video bitstream into another type of bitstream, either with or without standard format conversion. One step to be applied in video transcoders is the requantization of the transform coefficients, if an adaptation to a lower data rate is necessary. During this step, the quality is in most cases degraded compared to a single quantization. This is a consequence of nonoverlapping quantization characteristics of the input and the output quantizer. In this work we propose a new choice of the reconstruction level for the requantization step depending on the effective quantization curve of both quantization parameters involved. The reconstruction level is calculated such that it is centered in each effective quantization interval after requantization. Compared to the standard midpoint requantization this leads to quality gains of 3 dB PSNR for most pairs of input and output quantization parameters (QP). The algorithm is useful for intra- and inter-frame coding.
Contribution
  • U. Fecker
  • Marcus Barkowsky
  • A. Kaup

Improving the Prediction Efficiency for Multi-View Video Coding Using Histogram Matching.

In: Proceedings of the Picture Coding Symposium 2006. 24-26 April 2006, Beijing, China

  • Eds.:
  • Y. He

[Kenzler Conference Management] [Isernhagen]

  • (2006)
Applications for video data recorded with a setup of several cameras are currently attracting increasing interest. For such multi-view sequences, efficient coding is crucial to handle the enormous amount of data. However, significant luminance and chrominance variations between the different views, which often originate from imperfect camera calibration, are able to reduce the coding efficiency and the rendering quality. In this paper, we suggest the usage of histogram matching to compensate these differences in a pre-filtering step. After a description of the proposed algorithm, it is explained how histogram matching can be applied to multi-view video data. The effect of histogram matching on the coding performance is evaluated by statistically analysing prediction from temporal as well as from spatial references. For several test sequences, results are shown which indicate that the amount of spatial prediction across different camera views can be increased by applying histogram matching.
Contribution
  • Marcus Barkowsky
  • B. Eskofier
  • J. Bialkowski
  • A. Kaup

Influence of the Presentation Time on Subjective Votings of Coded Still Images.

In: Proceedings of the International Conference on Image Processing. pg. 429-432

IEEE

  • (2006)
The quality of coded images is often assessed by a subjective test. Usually the viewers get as much time as they need to find a stable result. In video sequences however, the viewer has to judge the quality in a shorter time that is defined by the changing content or a following scene cut. Therefore it is desirable to know the influence of a shorter presentation time on the perceptibility of distortions. In this paper we present the results of a suitable subjective test on coded still images. The images were presented for six different durations, ranging from 200 ms to 3 s. Special care was taken to avoid the memorization effect usually present after short presentations. The results show that the viewers tend to avoid extreme votings at short durations. The variance of the votings is also discussed in detail. Based on the result of the voting for the longest presentation time, we propose a prediction model for the voting of the shorter durations using a logistic curve fit. This presentation time model (PTM) is presented and analysed in detail.
Contribution
  • J. Bialkowski
  • Marcus Barkowsky
  • A. Kaup

Overview of Low-Complexity Video Transcoding from H.263 to H.264.

In: 2006 IEEE International Conference on Multimedia and Expo. pg. 49-52

  • (2006)
With the standardization of H.264/AVC by ITU-T and ISO/IEC and the adaptation into new hardware, the necessity of transcoding between existing standards and H.264 will arise to achieve interoperability between hardware devices. Because of the many new prediction parameters as well as the pixel-based deblocking filter and the new transform of H.264 this is a difficult task to perform. In our work we propose a fast cascaded pixel-domain transcoder from H.263 to H.264 for both intra- and inter-frame coding. The rate-distortion (RD) performance of the encoded bitstreams is compared to an exhaustive full-search approach. Our approach leads to 9% higher data rate on average, but the computational complexity for the prediction can be reduced by 90% and more. It will be shown that the algorithms proposed for H.263 are applicable for transcoding MPEG-2 to H.264, too.
Contribution
  • Marcus Barkowsky
  • R. Bitto
  • J. Bialkowski
  • A. Kaup
  • B. Li

Comparison of matching strategies for temporal frame registration in the perceptual evaluation of video quality.

In: Proceedings of the 2nd International Workshop on Video Processing and Quality Metrics for Consumer Electronics.

  • (2006)
In this paper we compare the performance of different full-frame and block-based algorithms for the temporal alignment of two video sequences. The setup is typical for full reference video quality estimation in a low-bitrate scenario. Lossless and lossy digital transmission scenarios are combined with different distortions usually found in playback devices. The results show that the choice of the algorithm for temporal registration depends very much on the type of additional distortion expected.
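One simple full-frame matching strategy can be sketched as follows. It is an assumed baseline, not necessarily one of the algorithms compared in the paper: every degraded frame is assigned the reference frame with the smallest mean squared error. Frames are lists of rows, and the function names are hypothetical.

```python
def frame_mse(a, b):
    """Mean squared error between two equally sized frames (lists of rows)."""
    count = len(a) * len(a[0])
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / count


def align_frames(reference, degraded):
    """For every degraded frame, return the index of the reference frame
    with the lowest full-frame MSE."""
    return [
        min(range(len(reference)), key=lambda i: frame_mse(reference[i], frame))
        for frame in degraded
    ]
```

A repeated index in the result would indicate a frozen or repeated frame, and a missing index would indicate a skipped one.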
Contribution
  • Marcus Barkowsky
  • J. Bialkowski
  • R. Bitto
  • A. Kaup

Temporal Registration using 3D Phase Correlation and a Maximum Likelihood Approach in the Perceptual Evaluation of Video Quality.

In: Proceedings of the IEEE International Workshop on Multimedia Signal Processing. pg. 195-198

  • (2007)
The estimation of the video quality is often performed using a full reference approach. One of the most important steps in a video quality measurement algorithm is to find the corresponding frames between the reference and the distorted video sequence. In this paper an algorithm with three steps is proposed. First, an extended version of the phase correlation is used to find candidate images with an arbitrary temporal offset, spatial scaling or spatial shift. Based on the assumption that the spatial scaling and spatial shift does not change during the sequence a set of probable parameters is selected. Finally, a maximum likelihood estimation is applied to select those temporal offsets which support the smoothest playback. A set of video sequences degraded with several distortions which are typical for multimedia scenarios are used to compare the performance to other algorithms.
Contribution
  • Marcus Barkowsky
  • B. Eskofier
  • R. Bitto
  • J. Bialkowski
  • A. Kaup

Perceptually motivated spatial and temporal integration of pixel based video quality measures.

In: Mobile Content Quality of Experience 2007 (MobConQoE '07): Fourth International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness. pg. 1-7

Association for Computing Machinery (ACM)

  • (2007)
In the evaluation of video quality often a full reference approach is used, thus calculating some measure of difference between the reference frames and the distorted frames. Often this measure returns one value per pixel, in the simplest case the squared difference. Conventionally, this pixel based measure is averaged over space and time. This paper introduces a psychophysically derived algorithm for this step. It uses the distribution of the cells in the fovea and the assumption that in a subjective test the part with the highest distortion is most important. Additionally, a temporal integration step is proposed which models the recency and forgiveness effect. Different video quality measures are enhanced with these two steps and their performance is evaluated using the results of a subjective test.
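The conventional baseline mentioned in this abstract, a per-pixel squared difference averaged over space and time, can be sketched as follows. The perceptual pooling proposed in the paper is not reproduced here. The PSNR helper is a commonly derived value added for illustration, and the 8-bit peak value of 255 is an assumption.

```python
import math


def pooled_mse(reference, distorted):
    """Average the per-pixel squared difference over all pixels and frames."""
    total, count = 0.0, 0
    for ref_frame, dist_frame in zip(reference, distorted):
        for ref_row, dist_row in zip(ref_frame, dist_frame):
            for r, d in zip(ref_row, dist_row):
                total += (r - d) ** 2
                count += 1
    return total / count


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical sequences."""
    error = pooled_mse(reference, distorted)
    return math.inf if error == 0 else 10 * math.log10(peak ** 2 / error)
```

Because every pixel and every frame carries the same weight, a short but severe local distortion contributes no more than a faint one spread over the whole sequence.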
Contribution
  • U. Fecker
  • Marcus Barkowsky
  • A. Kaup

Time-Constant Histogram Matching for Luminance and Chrominance Compensation of Multi-View Video Sequences.

In: Picture Coding Symposium (PCS).

Lisbon, Portugal

  • (2007)
Significant advances have recently been made in the coding of video data recorded with multiple cameras. However, luminance and chrominance variations between the camera views may deteriorate the performance of multi-view video codecs and renderers. In this paper, the usage of time-constant histogram matching is proposed to compensate these differences in a pre-filtering step. It is shown that the usage of histogram matching prior to multi-view video coding leads to significant gains for the coding efficiency of both the luminance and the chrominance components. Histogram matching can also be useful for image-based rendering to avoid incorrect illumination and colour reproduction resulting from miscalibrations in the recording setup. It can be shown that the algorithm is further improved by additionally using RGB colour conversion.
Journal article
  • J. Bialkowski
  • Marcus Barkowsky
  • A. Kaup

Fast Video Transcoding from H.263 To H.264/AVC.

In: Multimedia Tools and Applications vol. 35 pg. 127-146

Springer

  • (2007)

DOI: 10.1007/s11042-007-0126-7

In the past 10 years detailed works on different video transcoders have been published. However, the new ITU-T Recommendation H.264, also adopted as ISO/IEC MPEG-4 Part 10 (AVC), provides many new encoding options for the prediction processes that lead to difficulties for low complexity transcoding. In this work we present very fast transcoding techniques to convert H.263 bitstreams into H.264/AVC bitstreams. We explain why the proposed pixel domain approach is advantageous in this scenario compared to a DCT domain transcoder. Our approach results in less than 9% higher data rate at equivalent PSNR quality compared to a full-search approach. But this rate loss allows the reduction of the search complexity by a factor of over 200 for inter frames and still a reduction of over 70% for intra frames. A comparison to a fast search algorithm is given. We also provide simulation results showing that our algorithm works for transcoding MPEG-2 to H.264/AVC in the targeted scenario.
Journal article
  • U. Fecker
  • Marcus Barkowsky
  • A. Kaup

Histogram-Based Prefiltering for Luminance and Chrominance Compensation of Multiview Video.

In: IEEE Transactions on Circuits and Systems for Video Technology vol. 18 pg. 1258-1267

  • (2008)
Significant advances have recently been made in the coding of video data recorded with multiple cameras. However, luminance and chrominance variations between the camera views may deteriorate the performance of multiview codecs and image-based rendering algorithms. A histogram matching algorithm can be applied to efficiently compensate for these differences in a prefiltering step. A mapping function is derived which adapts the cumulative histogram of a distorted sequence to the cumulative histogram of a reference sequence. If all camera views of a multiview sequence are adapted to a common reference using histogram matching, the spatial prediction across camera views is improved. The basic algorithm is extended in three ways: a time-constant calculation of the mapping function, RGB color conversion, and the use of global disparity compensation. The best coding results are achieved when time-constant histogram calculation and RGB color conversion are combined. In this case, the usage of histogram matching prior to multiview encoding leads to substantial gains in the coding efficiency of up to 0.7 dB for the luminance component and up to 1.9 dB for the chrominance components. This prefiltering step can be combined with block-based illumination compensation techniques that modify the coder and decoder themselves, especially with the approach implemented in the multiview reference software of the joint video team (JVT). Additional coding gains up to 0.4 dB can be observed when both methods are combined.
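The mapping function described in this abstract, which adapts the cumulative histogram of a distorted sequence to that of a reference, can be sketched for a single component. This is a minimal illustration under the assumption of integer samples in `[0, levels)`. The time-constant calculation, RGB conversion and disparity compensation are not included, and all names are hypothetical.

```python
def cumulative_histogram(values, levels=256):
    """Normalised cumulative histogram of integer samples in [0, levels)."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / len(values))
    return cdf


def mapping_function(distorted, reference, levels=256):
    """Map each distorted level to the first reference level whose
    cumulative histogram reaches the distorted cumulative histogram."""
    cdf_dist = cumulative_histogram(distorted, levels)
    cdf_ref = cumulative_histogram(reference, levels)
    mapping, target = [], 0
    for level in range(levels):
        # Both histograms are non-decreasing, so `target` only moves forward.
        while target < levels - 1 and cdf_ref[target] < cdf_dist[level]:
            target += 1
        mapping.append(target)
    return mapping


def match_histogram(distorted, reference, levels=256):
    """Apply the mapping function to every distorted sample."""
    mapping = mapping_function(distorted, reference, levels)
    return [mapping[v] for v in distorted]
```

In a multiview setting each camera view would be mapped to one common reference view, once per component.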
Journal article
  • Marcus Barkowsky
  • J. Bialkowski
  • B. Eskofier
  • R. Bitto
  • A. Kaup

Temporal Trajectory Aware Video Quality Measure.

In: IEEE Journal of Selected Topics in Signal Processing vol. 3 pg. 266-279

IEEE

  • (2009)
The measurement of video quality for lossy and low-bitrate network transmissions is a challenging topic. Especially, the temporal artifacts which are introduced by video transmission systems and their effects on the viewer's satisfaction have to be addressed. This paper focuses on a framework that adds a temporal distortion awareness to typical video quality measurement algorithms. A motion estimation is used to track image areas over time. Based on the motion vectors and the motion prediction error, the appearance of new image areas and the display time of objects is evaluated. Additionally, degradations which stick to moving objects can be judged more exactly. An implementation of this framework for multimedia sequences, e.g., QCIF, CIF, or VGA resolution, is presented in detail. It shows that the processing steps and the signal representations that are generated by the algorithm follow the reasoning of a human observer in a subjective experiment. The improvements that can be achieved with the newly proposed algorithm are demonstrated using the results of the Multimedia Phase I database of the Video Quality Experts Group.
Contribution
  • Marcus Barkowsky
  • R. Cousseau
  • P. Le Callet

Influence of Depth Rendering on the Quality of Experience for an Autostereoscopic Display.

In: International Workshop on Quality of Multimedia Experience (QoMEx).

  • (2009)
Autostereoscopic displays simplify the presentation of 3D content because they do not require any glasses and they allow the perception of motion parallax. While the perception of depth is certainly an added value, the technical rendering process of the current display technology also introduces artifacts. For the viewer, the tradeoff may be expressed in terms of quality of experience. However, quality of experience assessment related to 3D is still an open issue. Towards this goal, several original subjective test methods are proposed and compared that are meant for assessing the quality of experience. A split-screen setup simultaneously displays a 2D and a 3D presentation. The observers vote according to their preference in terms of quality of experience. In four experiments, the influence of the depth rendering process is evaluated. The results indicate that the degradation by the depth rendering process may easily dominate the added value of depth in a content specific manner.
Thesis
  • Marcus Barkowsky

Subjective and Objective Video Quality Measurement in Low-Bitrate Multimedia Scenarios.

Friedrich-Alexander-Universität Erlangen-Nürnberg Verlag Dr. Hut München Erlangen

  • (2009)
In recent years, many distribution channels for low-bitrate video transmissions were set up. The parameter settings for the encoder, the transmission channel, the decoder and the playback device are manifold. In order to maintain customer satisfaction, it is necessary to carefully select and continuously tune those parameters and to monitor the resulting video quality at the receiver. This thesis considers the quality measurement by a human observer and by an automated algorithm. In the first part of the thesis, several subjective tests are performed in order to draw conclusions about the choice of transmission parameters. The experience gained from those experiments led to three psychophysical experiments that focus on isolated aspects of the video quality in lossless or lossy low-bitrate transmissions. Three distinct algorithms are deduced from the subjective experiments which deal with the temporal aspects. First, the visibility of artifacts is modeled when the viewer only has a short period of time for the examination. Second, the influence of transmission outages is modeled: The video playback may pause and content may be skipped if retransmission is not possible. Third, the visual degradation introduced by a reduction of the frame rate is modeled. The second part of the thesis is dedicated to the objective measurement. It is assumed that the reference video sequence is available for comparison with the degraded sequence. Because the performance of the automated measurement depends strongly on the correct alignment of the degraded signal to the reference signal, various algorithms are reviewed, enhanced, and compared that locate the corresponding reference frame for a given degraded frame. So far, many algorithms have been published that reliably predict the visual quality of still images or temporally undistorted video sequences.
In this thesis, a new framework is presented that allows the performance of these algorithms to be evaluated for temporally distorted video transmissions. The processing steps and the signal representations follow the reasoning of a human observer in a subjective experiment as observed in the first part of the thesis. The improvements that can be achieved with the newly proposed framework are demonstrated by comparing the objective scores with the subjective results of the comprehensive Multimedia Phase I database of the Video Quality Experts Group.
Contribution
  • Y. Pitrey
  • Marcus Barkowsky
  • P. Le Callet
  • R. Pépion

Subjective Quality Assessment Of MPEG-4 Scalable Video Coding In a Mobile Scenario.

In: European Workshop on Visual Information Processing (EUVIP) 2010. pg. 72

Paris, France

  • (2010)
Scalable Video Coding provides several levels of video encapsulated in a single video stream. In a transmission scenario such as broadcasting, this structure is quite advantageous as it can be used to address heterogeneous decoding targets with variable needs and requirements. However, this adaptability comes at a slight cost in coding efficiency when compared to single-layer coding. Based on subjective experiments, this cost is evaluated in this paper by comparing the new MPEG-4 Scalable Video Coding (SVC) standard with the now-established MPEG-4 AVC standard. Two scenarios are analyzed in the context of mobile transmission applications. The first scenario uses the same bitrate for SVC and AVC, leading to a slightly lower PSNR for SVC. The second scenario uses the same PSNR for SVC and AVC, leading to a slightly lower bitrate for AVC. The results of the subjective tests illustrate several interesting aspects of the relation between the performance of the two standards. First, we observe that the offset between AVC and SVC is not severe, though statistically significant in terms of user Mean Opinion Score (MOS) in such a context. Second, while adding another layer to SVC always leads to a performance loss, the impact of this loss decreases when the number of layers increases.
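How a Mean Opinion Score and a rough significance check might be computed from raw votes is sketched below. The abstract does not state which statistical test was used, so the normal-approximation confidence interval and the overlap check are assumptions, and the function names are hypothetical.

```python
import math
import statistics


def mean_opinion_score(votes):
    """Return the MOS and the half-width of an approximate 95% confidence interval."""
    mos = statistics.mean(votes)
    # Normal approximation (z = 1.96); small panels would use Student's t instead.
    half_width = 1.96 * statistics.stdev(votes) / math.sqrt(len(votes))
    return mos, half_width


def intervals_overlap(a, b):
    """True when two (mos, half_width) confidence intervals overlap."""
    return abs(a[0] - b[0]) <= a[1] + b[1]
```

Non-overlapping intervals are a conservative hint that two conditions differ; a formal comparison would apply a dedicated test to the paired votes.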
Contribution
  • Marcus Barkowsky
  • P. Le Callet

The influence of autostereoscopic 3D displays on subsequent task performance.

In: Proceedings of SPIE Vol. 7524: Stereoscopic Displays and Applications XXI. pg. 7524-7534

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2010)
Viewing 3D content on an autostereoscopic display is an exciting experience. This is partly due to the fact that the 3D effect is seen without glasses. Nevertheless, it is an unnatural condition for the eyes as the depth effect is created by the disparity of the left and the right view on a flat screen instead of having a real object at the corresponding location. Thus, it may be more tiring to watch 3D than 2D. This question is investigated in this contribution by a subjective experiment. A search task experiment is conducted and the behavior of the participants is recorded with an eyetracker. Several indicators both for low level perception as well as for the task performance itself are evaluated. In addition two optometric tests are performed. A verification session with conventional 2D viewing is included. The results are discussed in detail and it can be concluded that the 3D viewing does not have a negative impact on the task performance used in the experiment.
Contribution
  • W. Chen
  • J. Fournier
  • Marcus Barkowsky
  • P. Le Callet

New Requirements of Subjective Video Quality Assessment Methodologies for 3DTV.

In: Fifth International Workshop on Video Processing and Quality Metrics (VPQM) [Scottsdale, AZ, USA].

  • (2010)
In this paper, the new challenges of 3DTV for subjective assessment are discussed. Conventional 2D methods have severe limitations which will be revealed. Based on the understanding of the new characteristics brought by 3DTV, changes and additions in the requirements for subjective assessment are proposed in order to develop a novel subjective video quality assessment methodology for 3DTV. In particular, depth rendering for 3D displays is selected for further discussion. The depth rendering abilities are defined as a combination of the physical parameters and the perceptual constraints. We analyze different types of stereoscopic and multiview displays. Several problems regarding depth rendering are discussed in order to highlight the diversity and complexity of assessing 3DTV.
Contribution
  • Marcus Barkowsky
  • K. Wang
  • R. Cousseau
  • K. Brunnström
  • R. Olsson
  • P. Le Callet

Subjective Quality Assessment of Error Concealment Strategies for 3DTV in the presence of asymmetric Transmission Errors.

In: 2010 18th International Packet Video Workshop.

  • (2010)
The transmission of 3DTV sequences over packet based networks may result in degradations of the video quality due to packet loss. In the conventional 2D case, several different strategies are known for extrapolating the missing information and thus concealing the error. In 3D however, the residual error after concealment of one view might lead to binocular rivalry with the correctly received second view. In this paper, three simple alternatives are presented: frame freezing, a reduced playback speed, and displaying only a single view for both eyes, thus effectively switching to 2D presentation. In a subjective experiment the performance in terms of quality of experience of the three methods is evaluated for different packet loss scenarios. Error-free encoded videos at different bit rates have been included as anchor conditions. The subjective experiment method contains special precautions for measuring the Quality of Experience (QoE) for 3D content and also contains an indicator for visual discomfort. The results indicate that switching to 2D is currently the best choice but difficulties with visual discomfort should be expected even for this method.
Contribution
  • Marcus Barkowsky
  • P. Campisi
  • P. Le Callet
  • V. Rizzo

Crosstalk measurement and mitigation for autostereoscopic displays.

In: Proceedings of SPIE Vol. 7526: Three-Dimensional Image Processing (3DIP) and Applications. pg. 7526-7531

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2010)
In this paper we address the problem of crosstalk reduction for autostereoscopic displays. Crosstalk refers to the perception of one or more unwanted views in addition to the desired one. Specifically, the proposed approach consists of three different stages: a crosstalk measurement stage, where the crosstalk is modeled, a filter design stage, based on the results obtained out of the measurements, to mitigate the crosstalk effect, and a validation test carried out by means of subjective measurements performed in a controlled environment as recommended in ITU-R BT.500-11. Our analysis, synthesis, and subjective experiments are performed on the Alioscopy® display, which is a lenticular multiview display.
Contribution
  • U. Engelke
  • Marcus Barkowsky
  • P. Le Callet
  • H.-J. Zepernick

Modelling Saliency Awareness for Objective Video Quality Assessment.

In: International Workshop on Quality of Multimedia Experience (QoMEX) [June 2010; Trondheim, Norway].

  • (2010)
Contribution
  • Marcus Barkowsky
  • P. Le Callet

On the Perceptual Similarity of Realistic Looking Tone Mapped High Dynamic Range Images.

In: 2010 IEEE International Conference on Image Processing.

  • (2010)
High Dynamic Range (HDR) images are usually displayed on conventional Low Dynamic Range (LDR) displays because of the limited availability of HDR displays. For the conversion of the large dynamic luminance range into the eight bit quantized values, parameterized Tone Mapping Operators (TMO) are applied. Human observers are able to optimize the parameters in order to get the highest Quality of Experience by judging the displayed LDR images on a realism scale. In the study presented in this paper, two TMOs with three parameters each were evaluated by observers in a subjective experiment. Although the chosen parameter settings vary largely, the chosen images appear to have the same QoE for the observers. In order to assess this similarity objectively, three commonly used image quality measurement algorithms were applied. Their agreement with the preference of the observers was analyzed and it was found that the Visual Difference Predictor (VDP) outperforms the Structural Similarity Index and the Root Mean Square Error. A threshold value for VDP is derived that indicates when two LDR images appear to have the same Quality of Experience.
Contribution
  • Q. Huynh-Thu
  • Marcus Barkowsky
  • P. Le Callet

Video Quality Assessment: From 2D to 3D ‐ Challenges and Future Trends.

In: 2010 IEEE International Conference on Image Processing.

  • (2010)
Three-dimensional (3D) video is gaining a strong momentum both in the cinema and broadcasting industries as it is seen as a technology that will extensively enhance the user's visual experience. One of the major concerns for the wide adoption of such technology is the ability to provide sufficient visual quality, especially if 3D video is to be transmitted over a limited bandwidth for home viewing (i.e. 3DTV). Means to measure perceptual video quality in an accurate and practical way is therefore of highest importance for content providers, service providers, and display manufacturers. This paper discusses recent advances in video quality assessment and the challenges foreseen for 3D video. Both subjective and objective aspects are examined. An outline of ongoing efforts in standards-related bodies is also provided.
Contribution
  • U. Engelke
  • Marcus Barkowsky
  • P. Le Callet
  • H.-J. Zepernick

Modelling Saliency Awareness for Objective Video Quality Assessment.

In: IEEE International Workshop on Quality of Multimedia Experience (QoMEX) 2010.

Trondheim, Norway

  • (2010)
Existing video quality metrics do usually not take into consideration that spatial regions in video frames are of varying saliency and thus, differently attract the viewer's attention. This paper proposes a model of saliency awareness to complement existing video quality metrics, with the aim to improve the agreement of objectively predicted quality with subjectively rated quality. For this purpose, we conducted a subjective experiment in which human observers rated the annoyance of videos with transmission distortions appearing either in a salient region or in a non-salient region. The mean opinion scores confirm that distortions in salient regions are perceived much more annoying. It is shown that application of the saliency awareness model to two video quality metrics considerably improves their quality prediction performance.
Contribution
  • Y. Pitrey
  • Marcus Barkowsky
  • P. Le Callet
  • R. Pépion

Subjective Quality Evaluation of H.264 High-Definition Video Coding versus Spatial Up-Scaling and Interlacing.

In: QoE for Multimedia Content Sharing.

Tampere, Finland

  • (2010)
The upcoming High-Definition format for video display provides high-quality content, especially when displayed on adapted devices. When combined with video coding techniques such as MPEG-4 AVC/H.264, the transmission of High-Definition video content on broadcast networks becomes possible. Nonetheless, transmitting and decoding such video content is a real challenge. Therefore, intermediate formats based on lower frame resolutions or interlaced coding are still provided to address targets with limited resources. Using these formats, the final video quality depends on the postprocessing tools employed at the receiver to upsample and de-interlace these streams. In this paper, we compare the full-HD format to three possible scenarios to generate a full-HD stream from intermediate formats. We present the results of subjective tests that compare the visual quality of each scenario when using the same bitrate. The results show that using the same bitrate, the videos generated from lower-resolution formats reach similar quality compared to the full-HD videos.
Contribution
  • Marcus Barkowsky
  • M. Pinson
  • R. Pépion
  • P. Le Callet

Analysis of Freely Available Dataset for HDTV including Coding and Transmission Distortions.

In: Fifth International Workshop on Video Processing and Quality Metrics (VPQM) [Scottsdale, AZ, USA].

  • (2010)
We present the design, preparation, and analysis of a subjective experiment on typical HDTV sequences and scenarios. This experiment follows the guidelines of ITU and VQEG in order to obtain reproducible results. The careful selection of content and distortions extend over a wide and realistic range of typical transmission scenarios. Detailed statistical analysis provides important insight into the relationship between technical parameters of encoding, transmission and decoding and subjectively perceived video quality.
Contribution
  • Y. Pitrey
  • Marcus Barkowsky
  • P. Le Callet
  • R. Pépion

Evaluation of MPEG4-SVC for QoE Protection in the Context of Transmission Errors.

In: Proceedings of SPIE Vol. 7798: Applications of Digital Image Processing XXXIII.

  • Eds.:
  • SPIE - The International Society for Optical Engineering

San Diego, CA, USA

  • (2010)
Scalable Video Coding (SVC) provides a way to encapsulate several video layers with increasing quality and resolution in a single bitstream. Thus it is particularly adapted to address heterogeneous networks and a wide variety of decoding devices. In this paper, we evaluate the interest of SVC in a different context, which is error concealment after transmission on networks subject to packet loss. The encoded scalable video streams contain two layers with different spatial and temporal resolutions designed for mobile video communications with medium size and average to low bitrates. The main idea is to use the base layer to conceal errors in the higher layers if they are corrupted or lost. The base layer is first upscaled either spatially or temporally to reach the same resolution as the layer to conceal. Two error-concealment techniques using the base layer are then proposed for the MPEG-4 SVC standard, involving frame-level concealment and pixel-level concealment. These techniques are compared to the upscaled base layer as well as to a classical single-layer MPEG-4 AVC/H.264 error-concealment technique. The comparison is carried out through a subjective experiment, in order to evaluate the Quality-of-Experience of the proposed techniques. We study several scenarios involving various bitrates and resolutions for the base layer of the SVC streams. The results show that SVC-based error concealment can provide significantly higher visual quality than single-layer-based techniques. Moreover, we demonstrate that the resolution and bitrate of the base layer have a strong impact on the perceived quality of the concealment.
Contribution
  • J. Li
  • Marcus Barkowsky
  • J. Wang
  • P. Le Callet

Study on Visual Discomfort Induced by Stimulus Movement at Fixed Depth on Stereoscopic Displays using Shutter Glasses.

In: 2011 17th International Conference on Digital Signal Processing (DSP).

  • (2011)
Stereoscopic motion images are able to provide observers with realistic and immersive viewing experience. However, observers often experience visual discomfort during the viewing process. In this paper, we investigated the relationship between visual discomfort and the planar motion at different depth levels. The Paired Comparison method was used in the subjective experiments to allow for a precise measurement. The experimental results indicated that the relative angular disparity between foreground object and background played a more important role in determining the visual discomfort than the vergence-accommodation conflict. Furthermore, the results showed that with the increase of planar motion velocity, viewers might experience more visual discomfort. To quantify the effects of relative angular disparity and velocity on visual discomfort, two visual discomfort models were constructed. The preferred model was chosen based on the performance as well as the algorithmic complexity. This model can be used as an index for other related researches.
Contribution
  • Y. Pitrey
  • U. Engelke
  • P. Le Callet
  • Marcus Barkowsky
  • R. Pépion

Subjective Quality of SVC-coded Videos with different Error-Patterns concealed using Spatial Scalability.

In: Proceedings of the 3rd European Workshop on Visual Information Processing (EUVIP) 2011. pg. 67

Paris, France

  • (2011)
Degradation of network performance during video transmission may lead to disturbing visual artifacts. Some packets might be lost, corrupted or delayed, making it impossible to properly decode the video data on time at the receiver. The quality of the error-concealment technique, as well as the spatial and temporal position of the artifacts have a large impact on the perceived quality after decoding. In this paper, we use the spatial scalability feature of Scalable Video Coding (SVC) for error-concealment. This enables the transmission of a lower resolution video with a higher robustness, for example using unequal error protection. Under the assumption that only the higher resolution video would be affected, we evaluated the visual impact of packet losses in a large scale subjective video quality experiment using the Absolute Category Rating method. The number of impairments, the duration, and the interval between impairments as well the quality of the encoded lower resolution video are varied in a systematic evaluation. This allows for analyzing the influence of each factor both independently and jointly.
Contribution
  • W. Chen
  • J. Fournier
  • Marcus Barkowsky
  • P. Le Callet

New stereoscopic video shooting rule based on stereoscopic distortion parameters and comfortable viewing zone.

In: Proceedings of SPIE Vol. 7863: Stereoscopic Displays and Applications XXII. pg. 78631O

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2011)
Human binocular depth perception, the most important element brought by 3DTV, has been shown to be closely connected not only to the content acquisition (camera focal length, camera baseline, etc.) but also to the viewing environment (viewing distance, screen size, etc.). Conventional 3D stereography rules in the literature usually consider the general viewing condition and basic human factors to guide the content acquisition, such as assuming the human inter-pupil baseline as the maximum disparity. Many new elements or problems of stereoscopic viewing were not considered or precisely defined, so an advanced shooting rule is needed to guarantee the overall quality of stereoscopic video. In this paper, we propose a new stereoscopic video shooting rule considering the two most important issues in 3DTV: stereoscopic distortion and comfortable viewing zone. Firstly, a mathematical model mapping the camera space to the visualization space is established in order to geometrically estimate the stereoscopic depth distortion. Depth and shape distortion factors are defined and used to describe the stereoscopic distortion. Secondly, the comfortable viewing zone (or depth of focus) is considered to reduce the problem of visual discomfort and visual fatigue. The new shooting rule optimizes the camera parameters (focal length, camera baseline, etc.) in order to control depth and shape distortion and also to guarantee that the perceived scene is located in the comfortable viewing zone as far as possible. However, in some scenarios, the above two conditions cannot be fulfilled simultaneously and sometimes even contradict each other, so that a priority must be decided. In this paper, experimental stereoscopic synthetic content generation with various sets of camera parameters and various sets of scenes representing different depth ranges is presented. Justification of the proposed new shooting rule is based on subjective video assessment of 3D concepts (depth rendering, visual comfort and visual experience). The results of this study will provide a new method to propose camera parameters based on management of new criteria (shape distortion and depth of focus) in order to produce optimized stereoscopic images and videos.
Journal article
  • Q. Huynh-Thu
  • Marcus Barkowsky
  • P. Le Callet

The Importance of Visual Attention in Improving the 3D-TV Viewing Experience: Overview and New Perspectives.

In: IEEE Transactions on Broadcasting vol. 57 pg. 421-431

  • (2011)
Three-dimensional video content has attracted much attention in both the cinema and television industries, because 3D is considered to be the next key feature that can significantly enhance the visual experience of viewers. However, one of the major challenges is the difficulty in providing high quality images that are comfortable to view and that also meet signal transmission requirements over a limited bandwidth for display on television screens. The different processing steps that are necessary in a 3D-TV delivery chain can all introduce artifacts that may create problems in terms of human visual perception. In this paper, we highlight the importance of considering 3D visual attention when addressing 3D human factors issues. We provide a review of the field of 3D visual attention, discuss the challenges in both the understanding and modeling of 3D visual attention, and provide guidance to researchers in this field. Finally, we identify perceptual issues generated during the various steps in a typical 3D-TV broadcasting delivery chain, review them and explain how consideration of 3D visual attention modeling can help improve the overall 3D viewing experience.
Contribution
  • Marcus Barkowsky
  • R. Cousseau
  • P. Le Callet

Is visual fatigue changing the perceived depth accuracy on an autostereoscopic display?

In: Proceedings of SPIE Vol. 7863: Stereoscopic Displays and Applications XXII. pg. 78631V

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2011)
In this paper, a subjective study is presented which aims to measure the minimum perceivable depth difference on an autostereoscopic display in order to provide an indication for visual fatigue. The developed experimental setup was used to compare the subject's performance before and after 3D excitation on an autostereoscopic display. By comparing the results to a verification session with 2D excitation, the effect of 3D visual fatigue can be isolated. It was seen that it is possible to reach the threshold of acuity for stereo disparity on that autostereoscopic display. It was also found that the measured depth acuity is slightly higher after 3D viewing than after 2D viewing.
Contribution
  • N. Staelens
  • I. Sedano
  • Marcus Barkowsky
  • L. Janowski
  • K. Brunnström
  • P. Le Callet

Standardized Toolchain And Model Development For Video Quality Assessment ‐ The Mission Of The Joint Effort Group In VQEG.

In: Proceedings of 2011 Third International Workshop on Quality of Multimedia Experience (QoMEX). pg. 61

Mechelen, Belgium

  • (2011)
Since 1997, the Video Quality Experts Group (VQEG) has been active in the field of subjective and objective video quality assessment. The group has validated competitive quality metrics throughout several projects. Each of these projects requires mandatory actions such as creating a testplan and obtaining databases consisting of degraded video sequences with corresponding subjective quality ratings. Recently, VQEG started a new open initiative, the Joint Effort Group (JEG), for encouraging joint collaboration on all mandatory actions needed to validate video quality metrics. Within the JEG, effort is made to advance the field of both subjective and objective video quality measurement by providing proper software tools and subjective databases to the community. One of the subprojects of the JEG is the joint development of a hybrid H.264/AVC objective quality metric. In this paper, we introduce the JEG and provide an overview of the different ongoing activities within this newly started group.
Contribution
  • K. Wang
  • Marcus Barkowsky
  • R. Cousseau
  • K. Brunnström
  • R. Olsson
  • P. Le Callet
  • M. Sjöström

Subjective evaluation of HDTV stereoscopic videos in IPTV scenarios using absolute category rating.

In: Proceedings of SPIE Vol. 7863: Stereoscopic Displays and Applications XXII. pg. 78631T

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2011)
Broadcasting of high definition (HD) stereo-based 3D (S3D) TV is planned, or has already begun, in Europe, the US, and Japan. Specific data processing operations such as compression and temporal and spatial resampling are commonly used tools for saving network bandwidth when IPTV is the distribution form, as this results in more efficient recording and transmission of 3DTV signals; at the same time, however, it inevitably brings quality degradations to the processed video. This paper investigated observers' quality judgments of state-of-the-art video coding schemes (simulcast H.264/AVC or H.264/MVC), with or without added temporal and spatial resolution reduction of S3D videos, by subjective experiments using the Absolute Category Rating (ACR) method. The results showed that a certain spatial resolution reduction working together with high quality video compression was the most bandwidth-efficient way of processing video data when the required video quality is to be judged as "good" quality. As the subjective experiment was performed in two different laboratories in two different countries in parallel, a detailed analysis of the interlab differences was performed.
Contribution
  • Y. Pitrey
  • U. Engelke
  • Marcus Barkowsky
  • R. Pépion
  • P. Le Callet

Aligning Subjective Tests using a Low Cost Common Set.

In: QoE for Multimedia Content Sharing.

Lisbon, Portugal

  • (2011)
In this paper we use a common set between three subjective tests to build a linear mapping of the results of two tests onto the scale of one test identified as the reference test. We present our low-cost approach for the design of the common set and discuss the choice of the reference test. The mapping is then used to merge the outcomes of the three tests and provide an interesting comparison of the impact of coding artifacts, transmission errors and error-concealment in the context of Scalable Video Coding.
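As a rough illustration of the kind of linear mapping this abstract describes, the following minimal sketch fits a gain and an offset on the scores of the common set and then applies them to a test's remaining scores. It assumes an ordinary least-squares fit; the paper's actual fitting procedure is not stated here, and the function names and sample scores are hypothetical.

```python
def fit_linear_mapping(test_scores, reference_scores):
    """Least-squares fit of reference ~ gain * test + offset.

    Both lists hold the mean opinion scores of the SAME common-set
    conditions, once rated in the test to be aligned and once in the
    reference test. (Ordinary least squares is an assumption.)
    """
    n = len(test_scores)
    if n != len(reference_scores) or n < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean_x = sum(test_scores) / n
    mean_y = sum(reference_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    if sxx == 0:
        raise ValueError("common-set scores of the test must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, reference_scores))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    return gain, offset


def map_scores(scores, gain, offset):
    """Project scores of the aligned test onto the reference scale."""
    return [gain * s + offset for s in scores]


if __name__ == "__main__":
    # Hypothetical common-set scores, invented for illustration only.
    common_in_test = [1.0, 2.0, 3.0, 4.0, 5.0]
    common_in_reference = [1.4, 2.2, 3.0, 3.8, 4.6]
    gain, offset = fit_linear_mapping(common_in_test, common_in_reference)
    print(map_scores([2.5, 4.5], gain, offset))
```

The mapping is only as reliable as the common set is representative, which is presumably why the abstract discusses how that set is designed.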
Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky
  • P. Le Callet

A Subjective Evaluation of 3D IPTV Broadcasting Implementations Considering Coding and Transmission Degradation.

In: IEEE International Symposium on Multimedia (ISM). pg. 506-511

  • (2011)
This paper describes the results of a subjective test to assess current technology used for 3DTV broadcasting. As a first aspect, the performance of the currently deployed coding schemes was compared to state of the art algorithms. Our results show that down sampling and packing 3D stereoscopic videos according to the so called Side-By-Side format gives the highest perceived quality for a given bit rate. The second aspect of the study was to investigate how common 2D error concealment algorithms perform in case of 3D, and how their 3D-related performance compares with the 2D case. The results provide information on whether binocular suppression or binocular rivalries play the most important role for 3D video quality under transmission error. The results indicate that binocular rivalries and related visual discomfort are the dominant factors. Another aspect of the paper is a comparison of the test results with results from different labs to evaluate the repeatability of a subjective experiment in the 3D case, and to compare the employed test methodologies. Here, the study shows the variation between observers when they are rating visual discomfort and illustrates the difficulty to evaluate this new dimension.
Contribution
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Visual Discomfort Induced by Relative Disparity and Planar Motion of Stereoscopic Images.

In: first Sino French Workshop on Information and Communication Technologies. pg. 1-2

Nantes, France

  • (2011)
Viewers often complain of visual discomfort or visual fatigue after viewing the stereoscopic images. In this paper, we investigated the effects of planar motion at different depth levels on visual discomfort. In the subjective experiments, the Paired Comparison method was used to allow for a precise measurement. The Bradley-Terry model was used to analyze the subjective experimental data. The experimental results indicated that the relative angular disparity between foreground object and background played a more important role in determining the visual discomfort than the vergence-accommodation conflict. Furthermore, viewers might experience more visual discomfort with the increase of planar motion velocity. In a practical application of our study, it may be concluded that for stereoscopic motion images, the depth range for fast motion sequences should be significantly reduced and for slow motion sequences, the depth range may be increased.
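For readers unfamiliar with the Bradley-Terry model mentioned in this abstract, the sketch below shows one common way to fit it to paired-comparison counts, using the classical iterative (minorization-maximization) update. It is not the authors' implementation; the function name and the example counts are hypothetical.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry scores from paired-comparison counts.

    wins[i][j] is the number of times stimulus i was chosen over
    stimulus j. Under the model, P(i chosen over j) = s_i / (s_i + s_j).
    Returns scores normalized to sum to 1.
    Assumes every stimulus was chosen at least once; otherwise its
    score tends to zero and the estimate degenerates.
    """
    n = len(wins)
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (scores[i] + scores[j])
                        for j in range(n) if j != i)
            updated.append(total_wins / denom if denom > 0 else scores[i])
        norm = sum(updated)
        scores = [s / norm for s in updated]
    return scores


if __name__ == "__main__":
    # Hypothetical counts for three stimuli, 10 comparisons per pair.
    counts = [[0, 7, 9],
              [3, 0, 6],
              [1, 4, 0]]
    print(fit_bradley_terry(counts))
```

In a discomfort study, "chosen" would mean the stimulus judged as the more uncomfortable of the pair, so a larger score indicates more perceived discomfort.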
Contribution
  • J. Wang
  • Marcus Barkowsky
  • V. Ricordel
  • P. Le Callet

Quantifying how the Combination of Blur and Disparity affects the Perceived Depth.

In: Proceedings of SPIE Vol. 7865: Human Vision and Electronic Imaging XVI. pg. 78650K

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2011)
The influence of a monocular depth cue, blur, on the apparent depth of stereoscopic scenes will be studied in this paper. When 3D images are shown on a planar stereoscopic display, binocular disparity becomes a pre-eminent depth cue. But it induces simultaneously the conflict between accommodation and vergence, which is often considered as a main reason for visual discomfort. If we limit this visual discomfort by decreasing the disparity, the apparent depth also decreases. We propose to decrease the (binocular) disparity of 3D presentations, and to reinforce (monocular) cues to compensate the loss of perceived depth and keep an unaltered apparent depth. We conducted a subjective experiment using a two-alternative forced choice task. Observers were required to identify the larger perceived depth in a pair of 3D images with/without blur. By fitting the result to a psychometric function, we obtained points of subjective equality in terms of disparity. We found that when blur is added to the background of the image, the viewer can perceive larger depth comparing to the images without any blur in the background. The increase of perceived depth can be considered as a function of the relative distance between the foreground and background, while it is insensitive to the distance between the viewer and the depth plane at which the blur is added.
Journal article
  • Marcus Barkowsky
  • S. Tourancheau
  • K. Brunnström
  • K. Wang
  • B. Andrén

55.3: Crosstalk Measurements of Shutter Glasses 3D Displays.

In: SID Symposium Digest of Technical Papers vol. 42 pg. 812-815

SID

  • (2011)
Crosstalk is probably one of the main perceptual factors contributing to perceived image quality and visual comfort. The Video Quality Experts Group (VQEG) within their 3D video quality project is specifying a practical measurement procedure that will produce consistent results across laboratories. This paper is part of that effort. Two different methods of measuring crosstalk on shutter glasses stereo displays have been studied. One is based on time average luminance measurements and the other on temporal measurements. The results show that crosstalk is roughly 0.5% but that there are differences in the crosstalk between the two eyes in the shutter glasses.
Contribution
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

The Influence of Relative Disparity and Planar Motion Velocity on Visual Discomfort of Stereoscopic Videos.

In: Proceedings of 2011 Third International Workshop on Quality of Multimedia Experience (QoMEX).

Mechelen, Belgium

  • (2011)
The vergence-accommodation conflict, excessive screen disparity, binocular distortions and the motion component in stereoscopic videos are considered as main factors that may induce visual discomfort. In our previous study which was based on the experts-only experiment, we also found that the large relative disparity between the foreground and background and the fast planar motion were more likely to induce visual discomfort. In this study, we conducted the same subjective experiment but on non-expert observers. The subjective experiment results coincided with our previous findings. The two objective visual discomfort models developed in our previous study have been evaluated and showed high correlation with subjective data. Finally, we found that the observers could be classified into different clusters according to their visual discomfort sensitivity to the velocity or the relative disparity. For some observers, the velocity is the predominant factor that may induce visual discomfort; some consider that the relative disparity is the key factor, and some are sensitive to both the velocity and relative disparity.
Contribution
  • J. Wang
  • Marcus Barkowsky
  • V. Ricordel
  • P. Le Callet

Clarifying how defocus blur and disparity affect the perceived depth.

In: Proceedings of the First Sino-French Workshop on Education and Research collaborations in Information and Communication Technologies (SIFWICT) 2011. pg. 1

Nantes, France

  • (2011)
Human visual system takes advantage of different cues simultaneously to provide us the perception of depth. When 3D images are shown on a planar stereoscopic display, binocular disparity becomes a pre-eminent depth cue. But it induces simultaneously the conflict between accommodation and vergence, which is often considered as a main reason for visual discomfort. If we limit this visual discomfort by decreasing the disparity, the apparent depth also decreases. We propose to decrease the (binocular) disparity of 3D presentations, and to reinforce (monocular) cues to compensate the loss of perceived depth and keep an unaltered apparent depth. The influence of a monocular depth cue, blur, on the apparent depth of stereoscopic scenes was studied in our recent work. We conducted a subjective experiment using a two-alternative forced choice task. Observers were required to identify the larger perceived depth in a pair of 3D images with/without blur. By fitting the result to a psychometric function, we obtained points of subjective equality in terms of disparity. We found that when blur is added to the background of the image, the viewer can perceive larger depth comparing to the images without any blur in the background. The increase of perceived depth can be considered as a function of the relative distance between the foreground and background, while it is insensitive to the distance between the viewer and the depth plane at which the blur is added.
Contribution
  • M. Urvoy
  • Marcus Barkowsky
  • J. Gutiérrez
  • R. Cousseau
  • Y. Koudota
  • V. Ricordel
  • P. Le Callet
  • N. García

NAMA3DS1-COSPAD1: Subjective video quality assessment database on coding conditions introducing freely available high quality 3D stereoscopic sequences.

In: 2012 Fourth International Workshop on Quality of Multimedia Experience.

  • (2012)
Research in stereoscopic 3D coding, transmission and subjective assessment methodology depends largely on the availability of source content that can be used in cross-lab evaluations. While several studies have already been presented using proprietary content, comparisons between the studies are difficult since discrepant contents are used. Therefore in this paper, a freely available dataset of high quality Full-HD stereoscopic sequences shot with a semiprofessional 3D camera is introduced in detail. The content was designed to be suited for usage in a wide variety of applications, including high quality studies. A set of depth maps was calculated from the stereoscopic pair. As an application example, a subjective assessment has been performed using coding and spatial degradations. The Absolute Category Rating with Hidden Reference method was used. The observers were instructed to vote on video quality only. Results of this experiment are also freely available and will be presented in this paper as a first step towards objective video quality measurement for 3DTV.
Contribution
  • W. Chen
  • J. Fournier
  • Marcus Barkowsky
  • P. Le Callet

Quality of experience model for 3DTV.

In: Proceedings of IS&T/SPIE ELECTRONIC IMAGING | Stereoscopic Displays and Applications XXIII. vol. 8288 pg. 1-6

  • Eds.:
  • A. Woods
  • G. Favalora
  • N. Holliman

San Francisco, CA, USA

  • (2012)

DOI: 10.1117/12.907873

Modern stereoscopic 3DTV brings new QoE (quality of experience) to viewers, which not only enhances the 3D sensation due to the added binocular depth, but may also induce new problems such as visual discomfort. Subjective quality assessment is the conventional method to assess the QoE. However, the conventional perceived image quality concept is not enough to reveal the advantages and the drawbacks of stereoscopic images in 3DTV. Higher-level concepts such as visual experience were proposed to represent the overall visual QoE for stereoscopic images. In this paper, both the higher-level concept quality indicator, i.e. visual experience and the basic level concepts quality indicators including image quality, depth quantity, and visual comfort are defined. We aim to explore 3D QoE by constructing the visual experience as a weight sum of image quality, depth quantity and visual comfort. Two experiments in which depth quantity and image quality are varied respectively are designed to validate this model. In the first experiment, the stimuli consist of three natural scenes and for each scene, there are four levels of perceived depth variation in terms of depth of focus: 0, 0.1, 0.2 and 0.3 diopters. In the second experiment, five levels of JPEG 2000 compression ratio, 0, 50, 100, 175 and 250 are used to represent the image quality variation. Subjective quality assessments based on the SAMVIQ method are used in both experiments to evaluate the subject's opinion in basic level quality indicators as well as the higher-level indicator. Statistical analysis of result reveals how the perceived depth and image quality variation affect different perceptual scales as well as the relationship between different quality aspects.
Contribution
  • Marcus Barkowsky
  • N. Staelens
  • L. Janowski
  • Y. Koudota
  • M. Leszczuk
  • M. Urvoy
  • P. Hummelbrunner
  • I. Sedano
  • K. Brunnström

Subjective experiment dataset for joint development of hybrid video quality measurement algorithms.

In: QoEMCS 2012 ‐ Third Workshop on Quality of Experience for Multimedia Content Sharing. pg. 1-4

Berlin

  • (2012)
The application area of an objective measurement algorithm for video quality is always limited by the scope of the video datasets that were used during its development and training. This is particularly true for measurements which rely solely on information available at the decoder side, for example hybrid models that analyze the bitstream and the decoded video. This paper proposes a framework which enables researchers to train, test and validate their algorithms on a large database of video sequences in such a way that the ‐ often limited ‐ scope of their development can be taken into consideration. A freely available video database for the development of hybrid models is described containing the network bitstreams, parsed information from these bitstreams for easy access, the decoded video sequences, and subjectively evaluated quality scores.
Contribution
  • S. Tourancheau
  • K. Wang
  • J. Bulat
  • R. Cousseau
  • L. Janowski
  • K. Brunnström
  • Marcus Barkowsky

Reproducibility of crosstalk measurements on active glasses 3D LCD displays based on temporal characterization.

In: Proceedings of SPIE Vol. 8288: Stereoscopic Displays and Applications XXIII.

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2012)
Crosstalk is one of the main display-related perceptual factors degrading image quality and causing visual discomfort on 3D-displays. It causes visual artifacts such as ghosting effects, blurring, and lack of color fidelity which are considerably annoying and can lead to difficulties to fuse stereoscopic images. On stereoscopic LCD with shutter-glasses, crosstalk is mainly due to dynamic temporal aspects: imprecise target luminance (highly dependent on the combination of left-view and right-view pixel color values in disparity regions) and synchronization issues between shutter-glasses and LCD. These different factors influence largely the reproducibility of crosstalk measurements across laboratories and need to be evaluated in several different locations involving similar and differing conditions. In this paper we propose a fast and reproducible measurement procedure for crosstalk based on high-frequency temporal measurements of both display and shutter responses. It permits to fully characterize crosstalk for any right/left color combination and at any spatial position on the screen. Such a reliable objective crosstalk measurement method at several spatial positions is considered a mandatory prerequisite for evaluating the perceptual influence of crosstalk in further subjective studies.
Contribution
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Analysis and Improvement of a Paired Comparison Method in the Application of 3DTV Subjective Experiment.

In: 2012 19th IEEE International Conference on Image Processing.

  • (2012)
Paired comparison is a frequently used method in psychophysical studies. However, with the increase of the number of the stimuli, the number of comparisons increases exponentially. Square design is one of the balanced sub-set paired comparison methods which could reduce the number of comparisons while producing comparably precise results under some assumptions. However, when there are observation errors from observers' attentiveness, the square design would produce large estimation errors. Thus, an improved square design which is robust to observation errors is proposed. Using a Monte Carlo simulation, the proposed method is evaluated and shows improvement in efficiency. The original design is applied in a visual discomfort subjective test of 3DTV. In addition, both of the two designs are studied by utilizing our previous full comparison data. The test results showed that the proposed improved square design is more robust to observation errors. Another important finding is that the influence of the occurrence of some other stimuli on voting is significant. Whether the proposed method could reduce the prediction errors induced by it is still under study.
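The saving offered by a square design can be illustrated with a short count. The following is a minimal sketch, not the authors' implementation: it assumes the number of stimuli is a perfect square, places them row by row in an s x s grid, and compares only pairs that share a row or a column. The function names are hypothetical.

```python
from itertools import combinations
from math import isqrt


def full_design_pairs(n):
    """Return every unordered pair of n stimuli (full paired comparison)."""
    return list(combinations(range(n), 2))


def square_design_pairs(n):
    """Return the pairs compared in a square design.

    Assumption of this sketch: n = s * s stimuli are placed row by row
    in an s x s grid, and only pairs sharing a row or a column are kept.
    """
    s = isqrt(n)
    if s * s != n:
        raise ValueError("this sketch assumes a perfect-square number of stimuli")
    pairs = []
    for a, b in combinations(range(n), 2):
        same_row = a // s == b // s
        same_col = a % s == b % s
        if same_row or same_col:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    for n in (9, 16, 25):
        print(n, len(full_design_pairs(n)), len(square_design_pairs(n)))
```

With this layout the full design needs n(n-1)/2 comparisons, while the square design needs n(s-1), since each stimulus meets s-1 partners in its row and s-1 in its column.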
Contribution
  • Y. Pitrey
  • R. Pépion
  • P. Le Callet
  • Marcus Barkowsky

Using overlapping subjective datasets to assess the performance of objective quality metrics on Scalable Video Coding and error concealment.

In: Proceedings of the 2012 Fourth International Workshop on Quality of Multimedia Experience (QoMEX). pg. 103

Melbourne, Australia

  • (2012)
In this paper, four subjective video datasets are presented. The considered application is Scalable Video Coding used as an error-concealment mechanism. The presented datasets explore the relations between encoding parameters and perceived quality under different network-impairment patterns, and involve error concealment on the decoder's side to simulate a complete distribution channel. The datasets share a set of common configurations, which makes it possible, in the first part of the paper, to compare the outcomes of several Single Stimulus experiments and to draw interesting correspondences between different types of distortion. In the second part of the paper, we analyse the performance of three common objective quality metrics on each step of the distribution channel, to identify possible directions for improving their accuracy in predicting the perceived quality.
Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky
  • P. Le Callet

Perceptual depth indicator for S-3D content based on binocular and monocular cues.

In: 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR). pg. 734-738

  • (2012)
This article describes a general depth indicator for stereoscopic 3D video sequences. This indicator targets the characterization of the depth in 3D video sequences based on both monocular and binocular depth cues. Evaluating monocular depth cues is no easy task. Due to this high complexity, only a subset of the monocular depth cues is currently considered, namely those believed to have a major impact on depth perception. The proposed algorithm considers the following depth cues: binocular depth, linear perspective, blur from defocus, motion parallax and texture gradient. The design of each of these metrics is detailed. The second main contribution is the definition of a specific application scope for each metric. This is motivated by the need to take into account the reliability of each individual metric during pooling. The paper therefore identifies the cases where each individual metric may fail and integrates these aspects into the general combination of the depth cues.
Contribution
  • P. Lebreton
  • A. Raake
  • U. Wustenhagen
  • T. Buchholz
  • Marcus Barkowsky
  • P. Le Callet

A subjective and objective evaluation of a realistic 3D IPTV transmission chain.

In: Proceedings of the 19th International Packet Video Workshop (PV) 2012. pg. 179

München

  • (2012)
In 3D transmissions, often a large perceptual quality gain can be achieved by slightly increasing the bitrate. However, at a certain bitrate, a saturation effect is noted and further increasing the bitrate does not lead to significant improvements of Quality of Experience (QoE). This bitrate will be called quality saturation bitrate. The purpose of this paper is to investigate a subjective and objective method to determine the quality saturation bitrate. An evaluation is presented which uses a wide spread of content types and a realistic transmission chain that includes a hardware encoder and commercial Set-Top-Boxes. A subjective assessment for various bitrates is performed using the SAMVIQ methodology and the results are also compared to objective measurements with VQM and VQUAD.
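One way to read the notion of a quality saturation bitrate is as the lowest bitrate whose score is already close to the best score observed. The sketch below illustrates only that reading; it is an assumption and not the procedure used in the paper. The function name, the tolerance, and the sample values are hypothetical and do not come from the study.

```python
def saturation_bitrate(points, tolerance=0.2):
    """Return the lowest bitrate whose MOS is within `tolerance` of the best MOS.

    points: iterable of (bitrate, mos) tuples.
    The tolerance is an assumed threshold, not a value from the paper.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one (bitrate, mos) point is required")
    best_mos = max(mos for _, mos in points)
    return min(bitrate for bitrate, mos in points if mos >= best_mos - tolerance)


if __name__ == "__main__":
    # Made-up illustrative values (bitrate in Mbit/s, MOS on a 1..5 scale).
    example = [(2, 2.1), (4, 3.4), (6, 4.1), (8, 4.3), (12, 4.4), (16, 4.4)]
    print(saturation_bitrate(example))
```

A subjective test would supply the MOS values; an objective variant could feed predicted scores into the same function, after mapping them so that higher means better.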
Contribution
  • Marcus Barkowsky
  • M. Blestel
  • M. Carnec
  • A. Ksentini
  • P. Le Callet
  • G. Madec
  • R. Monnier
  • J. Nezan
  • R. Pepion
  • Y. Pitrey
  • J-F Travers
  • M. Raulet
  • A. Untersee

An Overview of the SVC4QoE project.

In: Mobile Multimedia Communications. 6th International ICST Conference, MOBIMEDIA 2010, Lisbon, Portugal, September 6-8, 2010. Revised Selected Papers (Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering) pg. 560-570

  • Eds.:
  • C. Verikoukis
  • J. Rodriguez
  • R. Tafazolli

Springer

  • (2012)
The aim of this paper is to give an overview of the SVC4QoE project, whose purpose is to use Scalable Video Coding (SVC) to optimize video transmission in terms of Quality of Experience over DVB-T2 channels. The originality of this project is to consider the influence of the whole chain of processing and delivery performed on the video data in terms of user-perceived quality. The encoding process, as well as transmission, decoding and display are included in the optimization process. Particularly, the multi-layer structure of SVC is to be exploited to circumvent alterations of the video related to transmission on error-prone networks. Combining SVC with QoE thus provides a way to reduce network operating costs, while increasing the quality from the user’s point of view. Several innovations are also to be mentioned, such as the use of a real-time open-source SVC decoder, evaluation of visual quality through subjective quality assessment tests, transmission and synchronization of SVC layers on the receiver’s side, and development and integration of an end-to-end DVB-T2 chain which implements the multi-PLP (Physical Layer Pipes) functionality in an SVC context.
Journal article
  • K. Wang
  • Marcus Barkowsky
  • K. Brunnström
  • M. Sjöström
  • R. Cousseau
  • P. Le Callet

Perceived 3D TV Transmission Quality Assessment: Multi-Laboratory Results Using Absolute Category Rating on Quality of Experience Scale.

In: IEEE Transactions on Broadcasting vol. 58 pg. 544-557

  • (2012)
Inspired by the rapidly increasing popularity of 3D movies, there is an industrial push for 3DTV services to the home. One important factor for the success and acceptance by the viewers is a positive quality of experience (QoE) of the new service when delivered. The questions of how to efficiently deliver 3DTV service to the home, and how to evaluate the visual quality perceived by end users are a recent research focus. We have investigated users' experience of stereoscopic 3D video quality by preparing two subjective assessment datasets. The first dataset aimed at the evaluation of efficient transmission in the transmission error free case, while the second focused on error concealment. A total of three subjective assessments, two for the first dataset and one for the second, were performed using the Absolute Category Rating with Hidden unimpaired Reference video (ACR-HR) method. The experimental setup shows that the ACR-HR subjective method provides repeatable results across labs and across conditions for video quality. It was also verified that MVC is more efficient than H.264 simulcast coding. Furthermore, it was discovered that, at the same level of quality of experience, spatial down-sampling may lead to better bitrate efficiency while temporal down-sampling is not acceptable. When network impairments occur, traditional 2D error concealment methods need to be reinvestigated as they were outperformed by displaying the same view for both eyes (switching to 2D presentation).
Journal article
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky
  • P. Le Callet

Evaluating Depth Perception of 3D Stereoscopic Videos.

In: IEEE Journal of Selected Topics in Signal Processing vol. 6 pg. 710-720

  • (2012)
3D video quality of experience (QoE) is a multidimensional problem; many factors contribute to the global rating like image quality, depth perception and visual discomfort. Due to this multidimensionality, it is proposed in this paper, that as a complement to assessing the quality degradation due to coding or transmission, the appropriateness of the non-distorted signal should be addressed. One important factor here is the depth information provided by the source sequences. From an application-perspective, the depth-characteristics of source content are of relevance for pre-validating whether the content is suitable for 3D video services. In addition, assessing the interplay between binocular and monocular depth features and depth perception are relevant topics for 3D video perception research. To achieve the evaluation of the suitability of 3D content, this paper describes both a subjective experiment and a new objective indicator to evaluate depth as one of the added values of 3D video.
Contribution
  • Y. Pitrey
  • P. Hummelbrunner
  • B. Kitzinger
  • S. Buchinger
  • Marcus Barkowsky
  • P. Le Callet
  • R. Pepion

Influence of Shooting Conditions, Re-Encoding and Viewing Conditions on the Perceived Quality of User-Generated Videos.

In: Sixth International Workshop on Video Processing and Quality Metrics for Consumer Electronics - VPQM 2012 (Jan 2012; Scottsdale, AZ, USA).

  • (2012)
Contribution
  • K. Brunnström
  • I. Sedano
  • K. Wang
  • Marcus Barkowsky
  • M. Kihl
  • P. Le Callet
  • M. Sjöström
  • A. Aurelius

2D No-Reference Video Quality Model Development and 3D Video Transmission Quality.

In: Sixth International Workshop on Video Processing and Quality Metrics for Consumer Electronics - VPQM 2012 (Jan 2012; Scottsdale, AZ, USA).

  • (2012)
This presentation will target two different topics in video quality assessment. First, we discuss 2D no-reference video quality model development. Further, we discuss how to find suitable quality for 3D video transmission. No-reference metrics are the only practical option for monitoring of 2D video quality in live networks. In order to decrease the development time, it might be possible to use full-reference metrics for this purpose. In this work, we have evaluated six full-reference objective metrics in three different databases. We show statistically that VQM performs the best. Further, we use these results to develop a lightweight no-reference model. We have also investigated users' experience of stereoscopic 3D video quality by performing the rating of two subjective assessment datasets, targeting in one dataset efficient transmission in the transmission error free case and error concealment in the other. Among other results, it was shown that, based on the same level of quality of experience, spatial down-sampling may lead to better bitrate efficiency while temporal down-sampling will be worse. When network impairments occur, traditional 2D error concealment methods need to be reinvestigated as they were outperformed by switching to 2D presentation.
Contribution
  • W. Chen
  • J. Fournier
  • Marcus Barkowsky
  • P. Le Callet

Exploration of Quality of Experience of Stereoscopic Images: Binocular Depth.

In: Sixth International Workshop on Video Processing and Quality Metrics for Consumer Electronics - VPQM 2012 (Jan 2012; Scottsdale, AZ, USA).

  • (2012)
Journal article
  • M. Pinson
  • L. Janowski
  • R. Pepion
  • Q. Huynh-Thu
  • C. Schmidmer
  • P. Corriveau
  • A. Younkin
  • P. Le Callet
  • Marcus Barkowsky
  • W. Ingram

The Influence of Subjects and Environment on Audiovisual Subjective Tests: An International Study.

In: IEEE Journal of Selected Topics in Signal Processing vol. 6 pg. 640-651

  • (2012)

DOI: 10.1109/JSTSP.2012.2215306

Traditionally, audio quality and video quality are evaluated separately in subjective tests. Best practices within the quality assessment community were developed before many modern mobile audiovisual devices and services came into use, such as internet video, smart phones, tablets and connected televisions. These devices and services raise unique questions that require jointly evaluating both the audio and the video within a subjective test. However, audiovisual subjective testing is a relatively under-explored field. In this paper, we address the question of determining the most suitable way to conduct audiovisual subjective testing on a wide range of audiovisual quality. Six laboratories from four countries conducted a systematic study of audiovisual subjective testing. The stimuli and scale were held constant across experiments and labs; only the environment of the subjective test was varied. Some subjective tests were conducted in controlled environments and some in public environments (a cafeteria, patio or hallway). The audiovisual stimuli spanned a wide range of quality. Results show that these audiovisual subjective tests were highly repeatable from one laboratory and environment to the next. The number of subjects was the most important factor. Based on this experiment, 24 or more subjects are recommended for Absolute Category Rating (ACR) tests. In public environments, 35 subjects were required to obtain the same Student's t-test sensitivity. The second most important variable was individual differences between subjects. Other environmental factors had minimal impact, such as language, country, lighting, background noise, wall color, and monitor calibration. Analyses indicate that Mean Opinion Scores (MOS) are relative rather than absolute. Our analyses show that the results of experiments done in pristine, laboratory environments are highly representative of those devices in actual use, in a typical user environment.
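The role of the number of subjects can be made concrete with a small calculation of a Mean Opinion Score and its confidence interval. This is a minimal sketch under stated assumptions, not the analysis performed in the paper: it uses a normal approximation instead of Student's t distribution, and the function name and the ratings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mos_with_ci(ratings, confidence=0.95):
    """Return (MOS, half-width of the confidence interval) for one stimulus.

    Assumption of this sketch: the interval uses a normal approximation,
    which is slightly narrower than a Student's t interval for small panels.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two ratings are required")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(ratings) / sqrt(n)
    return mean(ratings), half_width


if __name__ == "__main__":
    # Made-up ACR votes on a 1..5 scale, repeated to mimic larger panels.
    votes = [1, 2, 3, 4, 5]
    for repeats in (1, 4, 8):
        mos, half = mos_with_ci(votes * repeats)
        print(len(votes) * repeats, round(mos, 2), round(half, 3))
```

For a fixed spread of votes, the half-width shrinks roughly with the square root of the number of subjects, which is why noisier conditions need a larger panel to separate the same pair of stimuli.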
Contribution
  • Y. Pitrey
  • Marcus Barkowsky
  • R. Pépion
  • P. Le Callet
  • H. Hlavacs

Influence of the source content and encoding configuration on the perceived quality for scalable video coding.

In: Proceedings of SPIE Vol. 8291: Human Vision and Electronic Imaging XVII. pg. 1-6

  • Eds.:
  • SPIE - The International Society for Optical Engineering

San Francisco, CA, USA

  • (2012)
In video coding, it is commonly accepted that the encoding parameters such as the quantization step-size have an influence on the perceived quality. When dealing with Scalable Video Coding (SVC), the parameters used to encode each layer logically have an influence on the overall perceived quality. It is also commonly accepted that using given encoding parameters, the perceived quality does not change significantly according to the encoded source content. In this paper, we evaluate the impact of both SVC coding artifacts and source contents on the quality perceived by human observers. We exploit the outcomes of two subjective experiments designed and conducted under standard conditions in order to provide reliable results. The two experiments are aligned on a common scale using a set of shared processed video sequences, resulting in a database containing the subjective scores for 60 different sources combined with 20 SVC scenarios. We analyse the performance of several source descriptors in modeling the relative behaviour of a given source content when compared to the average of other source contents.
Contribution
  • K. Wang
  • K. Brunnström
  • Marcus Barkowsky
  • M. Urvoy
  • M. Sjöström
  • P. Le Callet
  • S. Tourancheau
  • B. Andrén

Stereoscopic 3D video coding quality evaluation with 2D objective metrics.

In: Proceedings of SPIE Vol. 8648: Stereoscopic Displays and Applications XXIV.

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2013)
The 3D video quality is of highest importance for the adoption of a new technology from a user’s point of view. In this paper we evaluated the impact of coding artefacts on stereoscopic 3D video quality by making use of several existing full reference 2D objective metrics. We analyzed the performance of the objective metrics by comparing them to the results of a subjective experiment. The results show that the pixel-based Visual Information Fidelity metric fits the subjective data best. The 2D stereoscopic video quality seems to have a dominant impact on stereoscopic videos impaired by coding artefacts.
Journal article
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Recent Advances in Standardization on 3D Quality of Experience.

In: IEEE COMSOC MMTC E-Letter vol. 8 pg. 20

  • (2013)
For the last decades, video quality assessment has mostly tackled 2D video sequences. Technological advances were mostly tackling coding and transmission schemes while the display technology, especially in lab viewing environments, could be considered as transparent. Subjective assessment methodologies needed to be selected mostly with respect to the severity of the degradations. Typical examples are Absolute Category Rating with Hidden Reference (ACR-HR) from ITU-T P.910 for strong degradations experienced in networked multimedia scenarios, and Paired Comparison (PC) or Double Stimulus Continuous Quality Scale (DSCQS) from ITU-R BT.500 for near lossless scenarios such as satellite transmissions.
Contribution
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Visual Discomfort is not always proportional to Eye Blinking Rate: Exploring Some Effects of Planar and In-Depth Motion on 3DTV QoE.

In: International Workshop on Video Processing and Quality Metrics for Consumer Electronics VPQM 2013 (Jan 2013; Scottsdale, AZ, USA).

  • (2013)
Visual discomfort is an important factor in determining QoE in 3DTV. It can be measured by physiological signals. In this study, the relationship between 3D video characteristics (e.g., motion type, disparity, velocity, etc), visual discomfort and eye blinking rate were studied. Three motion types were considered, which were static scenes, planar motion and in-depth motion. 44 stimuli with different motion types, disparity levels and velocity levels were studied. The eye blinking signals of 28 observers were obtained by an electrophysiological measurement device. The experimental results showed that stimulus velocity affected eye blinks significantly and differently for planar motion stimuli and in-depth motion stimuli. The objective eye blinking model for 3D stimuli was developed in function of the 3D video characteristics. Furthermore, the results showed that eye blinking rate was proportional to the visual discomfort of the static 3D stimuli but inversely proportional to the visual discomfort of planar motion stimuli.
Contribution
  • J. Paulus
  • G. Michelson
  • Marcus Barkowsky
  • J. Hornegger
  • B. Eskofier
  • M. Schmidt

Measurement of Individual Changes in the Performance of Human Stereoscopic Vision for Disparities at the Limits of the Zone of Comfortable Viewing.

In: 2013 International Conference on 3D Vision (3DV 2013). pg. 310-317

  • (2013)
3D displays enable immersive visual impressions, but the impact on human perception is still not fully understood. Viewing conditions like the convergence-accommodation (C-A) conflict have an unnatural influence on the visual system and might even lead to visual discomfort. As visual perception is individual, we assumed the impact of simulated 3D content on the visual system to be individual as well. In this study we aimed to analyze the stereoscopic visual performance of 17 subjects for disparities inside and outside the zone of comfortable viewing defined in the literature, to provide an individual evaluation of the impact of increased disparities on the performance of the visual system. Stereoscopic stimuli were presented in a four-alternative forced choice (4AFC) setup at different disparities. The response times as well as the correct decision rates indicated the performance of stereoscopic vision. The results showed that increased disparities lead to a decline in performance. Further, the impact of the presented disparities depends on the difficulty of the task. The decline in performance as well as the disparities at which the decline set in were subject dependent.
Journal article
  • M. Urvoy
  • Marcus Barkowsky
  • P. Le Callet

How visual fatigue and discomfort impact 3D-TV quality of experience: a comprehensive review of technological, psychophysical, and psychological factors.

In: Annales des Télécommunications vol. 68 pg. 641-655

  • (2013)

DOI: 10.1007/s12243-013-0394-3

The quality of experience (QoE) of 3D contents is usually considered to be the combination of the perceived visual quality, the perceived depth quality, and lastly the visual fatigue and comfort. When either fatigue or discomfort is induced, studies tend to show that observers prefer to experience a 2D version of the contents. For this reason, providing a comfortable experience is a prerequisite for observers to actually consider the depth effect as a visualization improvement. In this paper, we propose a comprehensive review on visual fatigue and discomfort induced by the visualization of 3D stereoscopic contents, in the light of physiological and psychological processes enabling depth perception. First, we review the multitude of manifestations of visual fatigue and discomfort (near triad disorders, symptoms for discomfort), as well as means for detection and evaluation. We then discuss how, in 3D displays, ocular and cognitive conflicts with real world experience may cause fatigue and discomfort; these include the accommodation-vergence conflict, the inadequacy between presented stimuli and observers' depth of focus, and the cognitive integration of conflicting depth cues. We also discuss some limits for stereopsis that constrain our ability to perceive depth, and in particular the perception of planar and in-depth motion, the limited fusion range, and various stereopsis disorders. Finally, this paper discusses how the different aspects of fatigue and discomfort apply to 3D technologies and contents. We notably highlight the need for respecting a comfort zone and avoiding camera and rendering artifacts. We also discuss the influence of visual attention, exposure duration, and training. Conclusions provide guidance for best practices and future research.
Journal article
  • W. Chen
  • J. Fournier
  • Marcus Barkowsky
  • P. Le Callet

Methodologies for Assessing 3D QoE: Standards and Explorative Studies.

In: ZTE Communications vol. 11 pg. 2-10

  • (2013)
Mastering quality of experience (QoE) is key to the widespread adoption of stereoscopic 3DTV (S-3DTV). However, assessing QoE of S-3DTV is not straightforward. Methods for determining observer experience need to be clearly defined and sufficiently robust. In this paper, we present state-of-the-art subjective QoE assessment for S-3DTV. We present conventional standardized ITU recommendations for evaluating picture quality and discuss new ITU activities in the area of S-3DTV assessment. We also present and discuss explorative studies from the literature. We then introduce ways of using conventional quality assessment for S-3DTV QoE assessment. In discussing our proposal, we mainly focus on QoE indicators and common features of subjective assessment. Multidimensional QoE indicators need to be used in S-3DTV to highlight advantages and reveal problems. In the second part of our proposal, we present the requirements for adapting ITU-R BT.500, a conventional subjective QoE assessment method, for assessing QoE of S-3DTV.
Contribution
  • M. Urvoy
  • Marcus Barkowsky
  • J. Li
  • P. Le Callet

Visual Comfort and Fatigue in Stereoscopy.

In: 3D Video: From Capture to Diffusion. pg. 309-329

  • Eds.:
  • L. Lucas
  • Y. Remion
  • C. Loscos

ISTE Ltd, Wiley London, UK

  • (2013)
Contribution
  • Marcus Barkowsky
  • N. Staelens
  • L. Janowski

Open collaboration on hybrid video quality models ‐ VQEG joint effort group hybrid.

In: 2013 IEEE 15th International Workshop on Multimedia Signal Processing (MMSP). pg. 476-481

  • (2013)
Several factors limit the advances on automatizing video quality measurement. Modelling the human visual system requires multi- and interdisciplinary efforts. A joint effort may bridge the large gap between the knowledge required in conducting a psychophysical experiment on isolated visual stimuli to engineering a universal model for video quality estimation under real-time constraints. The verification and validation requires input reaching from professional content production to innovative machine learning algorithms. Our paper aims at highlighting the complex interactions and the multitude of open questions as well as industrial requirements that led to the creation of the Joint Effort Group in the Video Quality Experts Group. The paper will zoom in on the first activity, the creation of a hybrid video quality model.
Contribution
  • M. Pinson
  • C. Schmidmer
  • L. Janowski
  • R. Pepion
  • Q. Huynh-Thu
  • P. Corriveau
  • A. Younkin
  • P. Le Callet
  • Marcus Barkowsky
  • W. Ingram

Subjective and objective evaluation of an audiovisual subjective dataset for research and development.

In: 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX). pg. 30-31

  • (2013)
In 2011, the Video Quality Experts Group (VQEG) ran subjects through the same audiovisual subjective test at six different international laboratories. That small dataset is now publicly available for research and development purposes.
Journal article
  • K. Brunnström
  • I. Ananth
  • C. Hedberg
  • K. Wang
  • B. Andrén
  • Marcus Barkowsky

36.4: Comparison between Different Rating Scales for 3D TV.

In: SID Symposium Digest of Technical Papers vol. 44 pg. 509-512

Blackwell Publishing Ltd

  • (2013)

DOI: 10.1002/j.2168-0159.2013.tb06256.x

Stereoscopic 3D viewing experience has been studied quite intensively recently, but the subjective test methods have not yet been settled. It has become clear that the 3D viewing experience cannot easily be described by just one scale. This paper describes a study where three different rating scales (Quality, Discomfort and Presence) are compared in a subjective test, combined with two viewing distances. The results show that in a stereoscopic 3D video quality test targeting mainly coding distortions, one scale such as video quality could be sufficient.
Journal article
  • M. Pinson
  • Marcus Barkowsky
  • P. Le Callet

Selecting scenes for 2D and 3D subjective video quality tests.

In: EURASIP Journal on Image and Video Processing pg. Article number: 50

  • (2013)
This paper presents recommended techniques for choosing video sequences for subjective experiments. Subjective video quality assessment is a well-understood field, yet scene selection is often driven by convenience or content availability. Three-dimensional testing is a newer field that requires new considerations for scene selection. The impact of experiment design on best practices for scene selection will also be considered. A semi-automatic selection process for content sets for subjective experiments will be proposed.
Contribution
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Boosting Paired Comparison methodology in measuring visual discomfort of 3DTV: Performances of three different designs.

In: Proceedings of SPIE Vol. 8648: Stereoscopic Displays and Applications XXIV.

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2013)
The pair comparison method is often recommended in subjective experiments because of the reliability of the obtained results. However, a drawback of this method is that the number of comparisons increases exponentially with the number of stimuli, which limits its usability for a large number of stimuli. Several design methods that aim to reduce the number of comparisons were proposed in the literature. However, their performances in the context of 3DTV should be evaluated carefully due to the fact that the results obtained from a paired comparison experiment in 3DTV may be influenced by two important factors. One is the observation error from the observer's attentiveness, in particular inverting the vote. The second factor concerns the dependence on the context in which the evaluation takes place. In this study, three design methods, namely the Full Paired Comparison method (FPC), the Square Design method (SD) and the Adaptive Square Design method (ASD), were evaluated in a subjective visual discomfort experiment in 3DTV. The results from the FPC method were considered as the ground truth. Compared with the ground truth, the ASD method provided the most accurate results with a given number of trials. It also showed the highest robustness against observation errors and interdependence of comparisons. Due to the efficiency of the ASD method, paired comparison experiments become feasible with a reasonably large number of stimuli for measuring 3DTV visual discomfort.
Contribution
  • Marcus Barkowsky
  • J. Li
  • T. Han
  • S. Youn
  • J. Ok
  • C. Lee
  • I. Vijai Ananth
  • K. Wang
  • K. Brunnström
  • P. Le Callet

Towards standardized 3DTV QoE assessment: Cross-lab study on display technology and viewing environment parameters.

In: Proceedings of SPIE Vol. 8648: Stereoscopic Displays and Applications XXIV.

  • Eds.:
  • SPIE - The International Society for Optical Engineering

  • (2013)
Subjective assessment of Quality of Experience in stereoscopic 3D requires new guidelines for the environmental setup as existing standards such as ITU-R BT.500 may no longer be appropriate. A first step is to perform cross-lab experiments in different viewing conditions on the same video sequences. Three international labs performed Absolute Category Rating studies on a freely available video database containing degradations that are mainly related to video quality degradations. Different conditions have been used in the labs: Passive polarized displays, active shutter displays, differences in viewing distance, the number of parallel viewers, and the voting device. Implicit variations were introduced due to the three different languages in Sweden, South Korea, and France. Although the obtained Mean Opinion Scores are comparable, slight differences occur in function of the video degradations and the viewing distance. An analysis on the statistical differences obtained between the MOS of the video sequences revealed that obtaining an equivalent number of differences may require more observers in some viewing conditions. It was also seen that the alignment of the meaning of the attributes used in Absolute Category Rating in different languages may be beneficial. Statistical analysis was performed showing influence of the viewing distance on votes and MOS results.
Contribution
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Subjective Assessment Methodology For Preference Of Experience In 3dtv.

In: Proceedings of the 11th IEEE IVMSP Workshop : 3D Image/Video Technologies and Applications. pg. 1

Seoul, South Korea

  • (2013)
The measurement of the Quality of Experience (QoE) in 3DTV recently became an important research topic as it relates to the development of the 3D industry. Pair comparison is a reliable method as it is easier for the observers to provide their preference on a pair rather than give an absolute scale value to a stimulus. The QoE measured by pair comparison is thus called Preference of Experience (PoE). In this paper, we introduce some efficient designs for pair comparison which can reduce the number of comparisons. The constraints of the presentation order of the stimuli in pair comparison test are listed. Finally, some analysis methods for pair comparison data are provided accompanied with some examples from the studies of the measurement of PoE.
Contribution
  • M. Leszczuk
  • L. Janowski
  • Marcus Barkowsky

Freely Available Large-scale Video Quality Assessment Database in Full-HD Resolution with H.264 Coding.

In: Proceedings of the 2013 IEEE Globecom Workshops (GC Wkshps). pg. 1

Atlanta, GA, USA

  • (2013)
Video databases often focus on a particular use case with a limited set of sequences. In this paper, a different type of database creation is proposed: an exhaustive number of test conditions will be continuously created and made freely available for objective and subjective evaluation. At the moment, the database comprises more than ten thousand JM/x264-encoded video sequences. An extensive study of the possible encoding parameter space led to a first subset selection of 1296 configurations. At the moment, only ten source sequences have been used, but extension to more than one hundred sequences is planned. Some Full-Reference (FR) and No-Reference (NR) metrics were selected and calculated. The resulting data will be freely available to the research community and possible exploitation areas are suggested.
Contribution
  • Marcus Barkowsky
  • K. Brunnström
  • T. Ebrahimi
  • L. Karam
  • P. Lebreton
  • P. Le Callet
  • A. Perkis
  • A. Raake
  • M. Subedar
  • K. Wang
  • L. Xing
  • J. You

Subjective and Objective Visual Quality Assessment in the Context of Stereoscopic 3D-TV.

In: 3D-TV System with Depth-Image-Based Rendering. pg. 413-437

  • Eds.:
  • L. Yu
  • C. Zhu
  • M. Tanimoto
  • Y. Zhao

Springer New York

  • (2013)

DOI: 10.1007/978-1-4419-9964-1_14

Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky
  • P. Le Callet

Perceptual preference of S3D over 2D for HDTV in dependence of video quality and depth.

In: 2013 IEEE 11th IVMSP Workshop. pg. 1-4

  • (2013)
3D video quality of experience (QoE) is a multidimensional problem and many factors contribute to the global experience of the user. Due to this multidimensionality, this paper evaluates the integral 3D video QoE and relates it to image quality and depth. Subjective tests have been conducted using paired comparison to evaluate 3D QoE and the preference of 3D over 2D with different combinations of coding conditions. Depth scores were available from previous work and were used to check their relation with 3D QoE; the difference between 2D and 3D QoE is found to be a function of the picture quality, and the desired preference of 3D presentation over 2D can be reached when pictorial quality is high enough (VQM score lower than 0.24). A factor ranging from 0.08 to 0.76 with a mean of 0.71 between pictorial quality and preference of 3D was also found.
Journal article
  • P. Le Callet
  • Marcus Barkowsky

On viewing distance and visual quality assessment in the age of Ultra High Definition TV.

In: VQEG (Video Quality Experts Group) eLetter vol. 1 pg. 25-30

Video Quality Experts Group

  • (2014)
The consumer video market is largely driven by the introduction of new formats (e.g., new pixel resolution). Each time, the story remains the same: what is the optimal viewing distance? Ultra High Definition TV is not an exception. This simple question is of crucial importance when it comes to the issue of quality and the added value of a new technology. In this letter, we revisit the topic, starting from best practices and then raising open questions.
Journal article
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Validation of reliable 3DTV subjective assessment methodology ‐ Establishing a Ground Truth Database.

In: VQEG (Video Quality Experts Group) eLetter vol. 1

  • (2014)
Quality of Experience (QoE) in 3DTV is a multi-dimensional concept which includes image quality, depth quality, and visual comfort. How to measure this multi-dimensional concept is a challenging issue nowadays. In this letter, we introduce a Ground Truth database which is targeted for the standardization of subjective methodologies for QoE in 3DTV.
Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky
  • P. Le Callet

Evaluating complex scales through subjective ranking.

In: 2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX). pg. 303-308

  • (2014)
In this paper, a subjective assessment methodology based on ranking, reordering and image categorization is proposed. It is designed to facilitate the evaluation of complex features such as monocular depth cues in natural images. These features can be particularly challenging to explain to test participants, and the proposed methodology is designed to provide the test participants with more insight and examples about the scale under evaluation. The performance of the method is compared to the absolute category rating methodology, and the advantages of each part of the test design are studied to verify the added value of the different steps of the evaluation procedure. Results show that the method is promising and can improve the stability of results between subjective test participants.
Contribution
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Assessing the Quality of Experience of 3DTV and Beyond: Tackling the Multidimensional Sensation.

In: 3D Future Internet Media. pg. 201-222

  • Eds.:
  • T. Dagiuklas
  • A. Kondoz

Springer New York

  • (2014)
Journal article
  • J. Li
  • Marcus Barkowsky
  • P. Le Callet

Visual discomfort of stereoscopic 3D videos: Influence of 3D motion.

In: Displays vol. 35 pg. 49-57

  • (2014)

DOI: 10.1016/j.displa.2014.01.002

Visual discomfort is one of the most frequent complaints of the viewers while watching 3D images and videos. Large disparity and large amount of motion are two main causes of visual discomfort. To quantify this influence, three objectives are set in this paper. The first one is the comparative analysis on the influence of different types of motion, i.e., static stereoscopic image, planar motion and in-depth motion, on visual discomfort. The second one is the investigation on the influence factors for each motion type, for example, the disparity offset, the disparity amplitude and velocity. The third one is to propose an objective model for visual discomfort. Thirty-six synthetic stereoscopic video stimuli with different types of motion are used in this study. In the subjective test, an efficient paired comparison method called Adaptive Square Design (ASD) was used to reduce the number of comparisons for each observer and keep the results reliable. The experimental results showed that motion does not always induce more visual discomfort than static conditions. The in-depth motion generally induces more visual discomfort than the planar motion. The relative disparity between the foreground and the background, and the motion velocity are identified as main factors for visual discomfort. According to the subjective results, an objective model for comparing visual discomfort induced by different types of motion is proposed which shows high correlation with the subjective perception.
Contribution
  • K. Zhu
  • Marcus Barkowsky
  • M. Shen
  • P. Le Callet
  • D. Saupe

Optimizing feature pooling and prediction models of VQA algorithms.

In: 2014 IEEE International Conference on Image Processing (ICIP). pg. 541-545

  • (2014)
In this paper, we propose a strategy to optimize feature pooling and prediction models of video quality assessment (VQA) algorithms with a much smaller number of parameters than methods based on machine learning, such as neural networks. Based on optimization, the proposed mapping strategy is composed of a global linear model for pooling extracted features, a simple linear model for local alignment in which local factors depend on source videos, and a non-linear model for quality calibration. Also, a reduced-reference VQA algorithm is proposed to predict the local factors from the source video. In the IRCCyN/IVC video database of content influence and the LIVE mobile video database, the performance of VQA algorithms is improved significantly by local alignment. The proposed mapping strategy with prediction of local factors outperforms one no-reference VQA metric and is comparable to one full-reference VQA metric. Thus predicting the local factors in local alignment based on video content will be a promising new approach for VQA.
Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky
  • P. Le Callet

Measuring perceived depth in natural images and study of its relation with monocular and binocular depth cues.

In: Proceedings of SPIE Vol. 9011: IS&T/SPIE Electronic Imaging Stereoscopic Displays and Applications XXV. pg. 1

  • Eds.:
  • SPIE - The International Society for Optical Engineering

San Francisco, CA, USA

  • (2014)
The perception of depth in images and video sequences is based on different depth cues. Studies have considered the depth perception threshold as a function of viewing distance (Cutting & Vishton, 1995), the combination of different monocular depth cues, their quantitative relation with binocular depth cues, and their different possible types of interaction (Landy, 1995). But these studies only consider artificial stimuli and none of them attempts to quantify the contribution of monocular and binocular depth cues relative to each other in the specific context of natural images. This study targets this particular application case. The strength of different depth cues is evaluated relative to each other using an image database carefully designed to cover as many combinations as possible of monocular (linear perspective, texture gradient, relative size and defocus blur) and binocular depth cues. The 200 images were evaluated in two distinct subjective experiments to evaluate separately perceived depth and different monocular depth cues. The methodology and the definition of the different scales will be detailed. The image database is also released for the scientific community.
Contribution
  • J. Li
  • Y. Koudota
  • Marcus Barkowsky
  • H. Primon
  • P. Le Callet

Comparing Upscaling Algorithms From HD To Ultra HD By Evaluating Preference Of Experience.

In: 2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX).

  • (2014)
As the next generation of TV, Ultra High Definition Television (UHDTV) is attracting more and more attention as it provides a new viewing experience. Considering content delivery, due to the lack of Ultra HD resources, a direct question for the industry is whether state-of-the-art upscaling algorithms can be utilized to upscale the current HD or Full HD resources to UHD, gaining benefit from the higher resolution without losing the high quality viewing experience. To investigate this, in this study, we upscaled 720p and 1080p sequences to UHD resolution using different upscaling algorithms. Paired Comparison methodology was used in the subjective experiment to evaluate their performance. The results showed that for fast motion content, viewers' preference among the differently upscaled video sequences is not significantly different. In general conditions, the low complexity upscaling algorithms (e.g., Lanczos-3) performed better than the high complexity algorithms (e.g., Robust Super Resolution algorithm). It is recommended that a novel upscaling algorithm adapted to UHD be developed based on the mechanisms of the human visual system.
Journal article
  • G. van Wallendael
  • N. Staelens
  • E. Masala
  • L. Janowski
  • K. Berger
  • Marcus Barkowsky

Dreamed about training, verifying and validating your QoE model on a million videos?

In: VQEG (Video Quality Experts Group) eLetter vol. 1 pg. 19-29

  • (2014)
Contribution
  • P. Lebreton
  • Marcus Barkowsky
  • A. Raake
  • P. Le Callet

Chapter 20: 3D Video.

In: Quality of Experience: Advanced Concepts, Applications and Methods. (T-Labs Series in Telecommunication Services) pg. 299-313

  • Eds.:
  • A. Raake
  • S. Möller

Springer International Publishing

  • (2014)
Journal article
  • Marcus Barkowsky
  • I. Sedano
  • K. Brunnström
  • M. Leszczuk
  • N. Staelens

Hybrid video quality prediction: reviewing video quality measurement for widening application scope.

In: Multimedia Tools and Applications vol. 74 pg. 323-343

Springer US

  • (2014)

DOI: 10.1007/s11042-014-1978-2

A tremendous number of objective video quality measurement algorithms have been developed during the last two decades. Most of them either measure a very limited aspect of the perceived video quality or they measure broad ranges of quality with limited prediction accuracy. This paper lists several perceptual artifacts that may be computationally measured in an isolated algorithm and some of the modeling approaches that have been proposed to predict the resulting quality from those algorithms. These algorithms usually have a very limited application scope but have been verified carefully. The paper continues with a review of some standardized and well-known video quality measurement algorithms that are meant for a wide range of applications and thus have a larger scope. Their prediction accuracy for individual artifacts is usually lower but some of them were validated to perform sufficiently well for standardization. Several difficulties and shortcomings in developing a general purpose model with high prediction performance are identified such as a common objective quality scale or the behavior of individual indicators when confronted with stimuli that are out of their prediction scope. The paper concludes with a systematic framework approach to tackle the development of a hybrid video quality measurement in a joint research collaboration.
Journal article
  • J. Li
  • A. Wang
  • J. Wang
  • Marcus Barkowsky
  • P. Le Callet

Visual Discomfort Induced by Three-Dimensional Display Technology (in Chinese).

In: Laser and Optoelectronics Progress vol. 52

  • (2015)
Contribution
  • Y. Rai
  • Marcus Barkowsky
  • P. Le Callet

Does H.265 based peri and para-foveal quality flicker disrupt natural viewing patterns?

In: 2015 International Conference on Systems, Signals and Image Processing (IWSSIP). pg. 133-136

  • (2015)
Region of interest based coders and also those coders that extensively use the Intra coding mode tend to produce flickering artefacts due to the inherent quality fluctuations they produce in the non-salient or homogeneously textured intra coded regions of a scene respectively. These non-salient areas are in turn incident on the para, peri and extra-peri foveal retinal regions that are especially sensitive to temporal artefacts. To study the perceptual effects of such a quality based flicker in the retinal periphery, in this experiment gaze data was collected from 48 observers in a free viewing scenario using a regulated flicker stimulus. The video stimulus in the mentioned peripheral regions was flickered with a predetermined frequency and amplitude by adaptively combining the streams produced using an H.265 video coder, and presenting them in a Gaze Contingent Display (GCD) setup. Utilizing statistical gaze analysis techniques, we observe that temporal quality flicker in the visual periphery plays a major role in disturbing the natural viewing patterns of an observer. Further, this effect was found to be strongest in the 7.5 Hz temporal band, thus showing that encoders that affect temporal quality smoothness must aim to avoid such distraction effects.
Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky
  • P. Le Callet

Open perceptual binocular and monocular descriptors for stereoscopic 3D images and video characterization.

In: 2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX). pg. 1-6

  • (2015)
This paper presents a toolbox for monocular and binocular depth estimation and an analysis of performance for some of these. At first, the definition of an appropriate depth indicator (DI) metric for 3D contents is discussed. To this aim, different algorithms from the literature for the characterization of 3D videos are compared. Results show that the simple 7.5 percentile of the disparity map values can already be an indicator, even though it may fail to address several perceptual aspects. In the latter cases, more advanced algorithms presented in this paper may be a better approach. In a second step, monocular depth indicators are described and analyzed. All code and tools enabling binocular and monocular depth estimation, such as depth-map estimation, depth map characterization and monocular depth cue indicator computation, are provided open-source. This will enable researchers to further characterize their 3D and 2D contents, for example before running a subjective experiment, or to automatically pre-screen 3D content that is to be presented to a larger number of viewers.
Contribution
  • G. van Wallendael
  • N. Staelens
  • E. Masala
  • Marcus Barkowsky

Full-HD HEVC-Encoded Video Quality Assessment Database.

In: Ninth International Workshop on Video Processing and Quality Metrics (VPQM) [Feb 2015; Chandler, AZ, USA].

  • (2015)
Contribution
  • A. Aldahdooh
  • Marcus Barkowsky
  • P. Le Callet

The impact of complexity in the rate-distortion optimization: A visualization tool.

In: 2015 International Conference on Systems, Signals and Image Processing (IWSSIP). pg. 45-48

  • (2015)
Recently, the new video coding standard, High Efficiency Video Coding (HEVC), was released. A 50% bitrate reduction at the same visual quality relative to the previous standard was achieved. These gains are partly related to tools that increase complexity such as coding decisions taken on smaller image areas or further improvements in motion compensated coding. This paper addresses the complexity issue, i.e. the execution time, and its impact when it is combined with the bitrate and distortion optimization of the video coding. Given a set of encoder configurations that restrict the encoding such that the complexity varies, an analysis including a visualization tool is proposed to help the user select the best configuration for a specific amount of rate, distortion and complexity. Possible target applications are also introduced.
Journal article
  • Marcus Barkowsky
  • E. Masala
  • G. van Wallendael
  • K. Brunnström
  • N. Staelens
  • P. Le Callet

Objective Video Quality Assessment - Towards Large Scale Video Database Enhanced Model Development.

In: IEICE Transactions on Communications vol. E98-B pg. 2-11

  • (2015)
The current development of video quality assessment algorithms suffers from the lack of available video sequences for training, verification and validation to determine and enhance the algorithm's application scope. The Joint Effort Group of the Video Quality Experts Group (VQEG-JEG) is currently driving efforts towards the creation of large scale, reproducible, and easy to use databases. These databases will contain bitstreams of recent video encoders (H.264, H.265), packet loss impairment patterns and impaired bitstreams, pre-parsed bitstream information into files in XML syntax, and well-known objective video quality measurement outputs. The database is continuously updated and enlarged using reproducible processing chains. Currently, more than 70,000 sequences are available for statistical analysis of video quality measurement algorithms. New research questions are posed as the database is designed to verify and validate models on a very large scale, testing and validating various scopes of applications, while subjective assessment has to be limited to a comparably small subset of the database. Special focus is given on the principles guiding the database development, and some results are given to illustrate the practical usefulness of such a database with respect to the detailed new research questions.
Contribution
  • M. Shahid
  • J. Panasiuk
  • G. van Wallendael
  • Marcus Barkowsky
  • B. Lovstrom

Predicting full-reference video quality measures using HEVC bitstream-based no-reference features.

In: 2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX). pg. 1-2

  • (2015)
This paper presents bitstream-based features for perceptual quality estimation of HEVC coded videos. Various factors including the impact of different sizes of block-partitions, use of reference-frames, the relative amount of various prediction modes, statistics of motion vectors and quantization parameters are taken into consideration for producing 52 features relevant for perceptual quality prediction. The test stimuli consist of 560 bitstreams that have been carefully extracted for this analysis from the 59,520 bitstreams of the large-scale database generated by the Joint Effort Group (JEG) of the Video Quality Experts Group (VQEG). The obtained results show the significance of the considered features through reasonably accurate and monotonic prediction of a number of objective quality metrics.
Contribution
  • K. Berger
  • Y. Koudota
  • Marcus Barkowsky
  • P. Le Callet

Subjective Quality Assessment Comparing UHD and HD Resolution in HEVC Transmission Chain.

In: 2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX).

  • (2015)
Ultra High Definition Television (UHDTV) is an emerging broadcasting system, aiming to replace the current High Definition Television (HDTV) in the near future. One aspect of UHDTV is to allow for higher resolutions, notably UHD1, which requires a four times increased datarate for uncompressed transmission compared to Full-HD resolution. As bandwidth for transmission channels in television broadcast is often a fixed value, this study provides information about the perceived quality for transmitting UHD1 content compared to Full-HD content at the same bitrate encoded with HEVC. Content influence is tested with 15 video contents and 4 bitrates are individually chosen per content to span the range of the perceptual scale. The methodology of Absolute Category Rating with Hidden Reference (ACR-HR) is used to collect votes from 24 viewers. The statistical analysis of the collected data shows that, in most cases, there is no significant quality difference between videos transmitted in Full-HD and UHD1 resolution but that the results strongly depend on the content type and on the capture quality. It is also shown that the required bitrate for achieving a chosen broadcast quality level varies with content by a factor of about 14 in HEVC coding.
Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky

Studying user agreement on aesthetic appeal ratings and its relation with technical knowledge.

In: 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX).

  • (2016)

DOI: 10.1109/QoMEX.2016.7498934

In this paper, a crowdsourcing experiment was conducted involving different panels of participants. The aim of this study is to evaluate how the preference of one image over another is related to the participant's knowledge of photography. In previous work, two discriminant evaluation concepts were found to distinguish groups of participants with different degrees of knowledge in photography. Each of these groups provided different means of aesthetic appeal ratings when asked to rate on an absolute category scale. The present paper extends previous work by studying preference ratings on a set of image pairs as a function of technical knowledge, with a specific focus on the variance of ratings and the agreement between participants. The conducted study was composed of two steps: the participants first reported their preference of one image over another (paired comparison), and then their technical background was evaluated using a specific set of images. Based on preference-rating patterns, groups of participants were identified. These groups were formed by clustering the participants who saw the same images and shared the same preference ratings into one group, and the participants with low agreement with other participants into another group. A per-group analysis showed that a high agreement between participants could be observed when participants have technical knowledge. This indicates that higher consistency between participants can be reached when expert users are recruited, and therefore participants should be carefully selected in image aesthetic appeal evaluation to ensure stable results.
Contribution
  • A. Aldahdooh
  • E. Masala
  • O. Janssens
  • G. van Wallendael
  • Marcus Barkowsky

Comparing simple video quality measures for loss-impaired video sequences on a large-scale database.

In: 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX). pg. 1-6

  • (2016)
The performance of objective video quality measures is usually identified by comparing their predictions to subjective assessment results which are regarded as the ground truth. In this work we propose a complementary approach for this performance evaluation by means of a large-scale database of test sequences evaluated with several objective measurement algorithms. Such an approach is expected to detect performance anomalies that could highlight shortcomings in current objective measurement algorithms. Using realistic coding and network transmission conditions, we investigate the consistency of the prediction of different measures as well as how much their behavior can be predicted by content, coding and transmission features, discussing unexpected and peculiar behaviors, and highlighting how a large-scale database can help in identifying anomalies not easily found by means of subjective testing. We expect that this analysis will shed light on directions to pursue in order to overcome some of the limitations of existing reliability assessment methods for objective video quality measures.
Contribution
  • P. Lebreton
  • A. Raake
  • Marcus Barkowsky

Evaluation of aesthetic appeal with regard of user's knowledge.

In: Proceedings of SPIE Conference Human Vision and Electronic Imaging 2016 (HVEI 2016).

  • (2016)
Contribution
  • Y. Rai
  • A. Aldahdooh
  • S. Ling
  • Marcus Barkowsky
  • P. Le Callet

Effect of content features on short-term video quality in the visual periphery.

In: 2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP).

  • (2016)
The area outside our central field of vision, also referred to as the visual periphery, captures most information in a visual scene, although it is much less sensitive than the central fovea. Vision studies in the past have stated that there is reduced sensitivity of texture, color, motion and flicker (temporal harmonic) perception in this area, which has an interesting application in the domain of quality perception. In this work, we particularly analyze the perceived subjective quality of videos containing H.264/AVC transmission impairments, incident at various degrees of retinal eccentricity of observers. We relate the perceived drop in quality to five basic types of features that are important from a perceptual standpoint: texture, color, flicker, motion trajectory distortions and also the semantic importance of the underlying regions. We observe that the perceived drop in quality across the visual periphery is closely related to the Cortical Magnification fall-off characteristics of the V1 cortical region. Additionally, we see that while object importance and low frequency spatial distortions are important indicators of quality in the central foveal region, temporal flicker and color distortions are the most important determinants of quality in the periphery. We therefore conclude that, although users are more forgiving of distortions they view peripherally, they are nevertheless not totally blind towards them: the effects of flicker and color distortions are particularly important.
Contribution
  • A. Aldahdooh
  • Marcus Barkowsky
  • P. Le Callet

Spatio-temporal error concealment technique for high order multiple description coding schemes including subjective assessment.

In: 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX). pg. 1-6

  • (2016)
Error resilience (ER) is an important tool in video coding to maximize the Quality of Experience (QoE). The prediction process in video coding has become complex, which yields unsatisfying video quality when NAL unit packets are lost in error-prone channels. There are different ER techniques, and multiple description coding (MDC) is one of the promising techniques for this problem. MDC is categorized into different types and, in this paper, we focus on temporal MDC techniques. A new temporal MDC scheme is proposed. In the encoding process, the encoded descriptions contain primary frames and secondary frames (redundant representations). The secondary frames represent the MVs that are predicted from previous primary frames such that the residual signal is set to zero and is not part of the rate distortion optimization. In the decoding process of the lost frames, a weighted average error concealment (EC) strategy is proposed to conceal these frames. The proposed scheme is subjectively evaluated along with other schemes and the results show that the proposed scheme is significantly different from most of the other temporal MDC schemes.
Contribution
  • A. Aldahdooh
  • E. Masala
  • G. van Wallendael
  • Marcus Barkowsky

Comparing temporal behavior of fast objective video quality measures on a large-scale database.

In: 2016 Picture Coding Symposium (PCS).

  • (2016)
In many application scenarios, video quality assessment is required to be fast and reasonably accurate. The characterization of objective algorithms by subjective assessment is well established but limited due to the small number of test samples. Verification using large-scale objectively annotated databases provides a complementary solution. In this contribution, three simple but fast measures are compared regarding their agreement on a large-scale database. In contrast to subjective experiments, not only sequence-wise but also framewise agreement can be analyzed. Insight is gained into the behavior of the measures with respect to 5952 different coding configurations of High Efficiency Video Coding (HEVC). Consistency within a video sequence is analyzed as well as across video sequences. The results show that the occurrence of discrepancies depends mostly on the configured coding structure and the source content. The detailed observations stimulate questions on the combined usage of several video quality measures for encoder optimization.
Contribution
  • Y. Rai
  • Marcus Barkowsky
  • P. Le Callet

Role of spatio-temporal distortions in the visual periphery in disrupting natural attention deployment.

In: Proceedings of SPIE Conference Human Vision and Electronic Imaging 2016 (HVEI 2016).

  • (2016)

DOI: 10.2352/ISSN.2470-1173.2016.16.HVEI-117

Human visual system based quality metrics and perceptually optimized video coders often use principles of foveation and saliency to weigh the distortion in certain regions more heavily or hide the artefacts in regions where they are less noticeable. These approaches however fail to consider the impact such a tuning produces on the non-salient surroundings usually incident on the para, peri and extra-peri foveal visual regions. Vision studies on the other hand have highlighted the enhanced sensitivity of these peripheral visual regions towards spatio-temporal artefacts, more so in the supra-threshold region. Because such analysis has often been performed using controlled synthetic stimuli and forced fixation based experimental approaches that assume perfect luminance adaptation, tracking and semantic comprehension of underlying content, a thorough understanding of the impact of peripheral disturbances in a natural viewing scenario is missing. The present work therefore uses a Gaze Contingent Display to study the impact of spatio-temporal distortions in the peri foveal and extra-peri foveal regions in a free-viewing scenario, using natural scene stimuli. An analysis of the gaze data collected from 48 observers with four state-of-the-art gaze analysis techniques, both spatio-temporal and semantic, confirms and extends our previous understanding of distortion perception in the periphery. Our observations indicate, first, that non-flickering spatial distortions seem to have less of a disruptive effect in the visual periphery than temporally flickering artefacts and, second, that the threshold at which disruptions begin to occur is higher in the visual periphery than in the fovea, both of these effects being strongly scene dependent and prone to natural scene masking. The results highlight the need for sufficient consideration of the supra-threshold effects of peripheral distortions in order to achieve an optimum perceptual experience.
Journal article
  • A. Aldahdooh
  • E. Masala
  • G. Wallendael
  • Marcus Barkowsky

Framework for reproducible objective video quality research with case study on PSNR implementations.

In: Digital Signal Processing vol. 77 pg. 195-206

  • (2018)

DOI: 10.1016/j.dsp.2017.09.013

Reproducibility is an important and recurrent issue in objective video quality research because the presented algorithms are complex, depend on specific implementations in software packages or their parameters need to be trained on a particular, sometimes unpublished, dataset. Textual descriptions often lack the required detail and even for the simple Peak Signal to Noise Ratio (PSNR) several mutations exist for images and videos, in particular considering the choice of the peak value and the temporal pooling. This work presents results achieved through the analysis of objective video quality measures evaluated on a reproducible large scale database containing about 60,000 HEVC coded video sequences. We focus on PSNR, one of the most widespread measures, considering its two most common definitions. The sometimes largely different results achieved by applying the two definitions highlight the importance of the strict reproducibility of the research in video quality evaluation in particular. Reproducibility is also often a question of computational power and PSNR is a computationally inexpensive algorithm running faster than realtime. Complex algorithms cannot be reasonably developed and evaluated on the abovementioned 160 hours of video sequences. Therefore, techniques to select subsets of coding parameters are then introduced. Results show that an accurate selection can preserve the variety of the results seen on the large database but with much lower complexity. Finally, note that our SoftwareX accompanying paper presents the software framework which allows the full reproducibility of all the research results presented here, as well as how the same framework can be used to produce derived work for other measures or indexes proposed by other researchers which we strongly encourage for integration in our open framework.
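
The sensitivity of PSNR to temporal pooling mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not spell out the two definitions it compares, so the two variants below (averaging per-frame PSNR values versus converting the average MSE once) are an assumption chosen for illustration, as are the function names, the peak value of 255 and the toy pixel data.

```python
import math

def mse(ref, dist):
    """Mean squared error between two equally sized frames (flat lists of pixel values)."""
    return sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)

def psnr_from_mse(m, peak=255.0):
    """PSNR in dB for a given MSE; identical frames (MSE 0) give infinity."""
    return math.inf if m == 0 else 10.0 * math.log10(peak ** 2 / m)

def psnr_mean_of_frames(ref_frames, dist_frames, peak=255.0):
    """Variant A (assumed): compute PSNR per frame, then average the dB values."""
    values = [psnr_from_mse(mse(r, d), peak) for r, d in zip(ref_frames, dist_frames)]
    return sum(values) / len(values)

def psnr_of_mean_mse(ref_frames, dist_frames, peak=255.0):
    """Variant B (assumed): average the MSE over all frames, then convert once to dB."""
    mses = [mse(r, d) for r, d in zip(ref_frames, dist_frames)]
    return psnr_from_mse(sum(mses) / len(mses), peak)

# Hypothetical two-frame "video": one lightly and one heavily distorted frame.
ref = [[100, 100, 100, 100], [100, 100, 100, 100]]
dist = [[101, 99, 101, 99], [110, 90, 110, 90]]

a = psnr_mean_of_frames(ref, dist)
b = psnr_of_mean_mse(ref, dist)
print(f"mean of per-frame PSNR: {a:.2f} dB")
print(f"PSNR of mean MSE:       {b:.2f} dB")
```

On this toy input the two poolings differ by several dB, because averaging in the logarithmic domain gives less weight to the heavily distorted frame than averaging the MSE does.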
Journal article
  • A. Aldahdooh
  • E. Masala
  • O. Janssens
  • G. Wallendael
  • Marcus Barkowsky
  • P. Callet
  • P. Lambert

Improved Performance Measures for Video Quality Assessment Algorithms Using Training and Validation Sets.

In: IEEE Transactions on Multimedia vol. 74 pg. 32-41

  • (2018)

DOI: 10.1109/TMM.2018.2882091

Due to the three-dimensional spatiotemporal regularities of natural videos and small-scale video quality databases, effective objective video quality assessment (VQA) metrics are difficult to obtain but highly desirable. In this paper, we propose a general-purpose no-reference VQA framework that is based on weakly supervised learning with convolutional neural network (CNN) and resampling strategy. First, an eight-layer CNN is trained by weakly supervised learning to construct the relationship between the deformations of the three dimensional discrete cosine transform of video blocks and corresponding weak labels judged by a full-reference (FR) VQA metric. Thus, the CNN obtains the quality assessment capacity converted from the FR-VQA metric, and the effective features of the distorted videos can be extracted through the trained network. Then, we map the frequency histogram calculated from the quality score vectors predicted by the trained network onto the perceptual quality. Specially, to improve the performance of the mapping function, we transfer the frequency histogram of the distorted images and videos to resample the training set. The experiments are carried out on several widely used video quality assessment databases. The experimental results demonstrate that the proposed method is on a par with some state-of-the-art VQA metrics and has promising robustness.
Journal article
  • Andreas Gegenfurtner
  • Armin Eichinger
  • Richard Latzel
  • Marc-Philipp Dietrich
  • Marcus Barkowsky
  • Alexandra Glufke
  • Angelika Stadler
  • Wolfgang Stern

Mobiles Eye-Tracking in den angewandten Wissenschaften.

In: Bavarian Journal of Applied Sciences vol. 4 pg. 370-395

  • (2018)

DOI: 10.25929/bjas.v4i1.54

Mobile eye tracking is more popular than ever as a research method and is gaining more and more importance in various fields of the applied sciences. This article discusses how the recording and analysis of eye movements is used in mobility, usability engineering, sports science, augmented reality/mixed reality/virtual reality, and medicine and continuing medical education. The article is divided into three parts: the first part explains the fundamentals of eye tracking; the second part illustrates the use of mobile eye tracking in selected fields of the applied sciences; and the concluding third part outlines potentials and risks as well as future lines of research in order to further establish mobile eye tracking as a digital research method.
Journal article
  • K. Brunnström
  • Marcus Barkowsky

Statistical quality of experience analysis: on planning the sample size and statistical significance testing.

In: Journal of Electronic Imaging vol. 27 pg. 053013

  • (2018)

DOI: 10.1117/1.JEI.27.5.053013

This paper analyzes how an experimenter can balance errors in subjective video quality tests between the statistical power of finding an effect if it is there and not claiming that an effect is there if the effect is not there, i.e., balancing Type I and Type II errors. The risk of committing Type I errors increases with the number of comparisons that are performed in statistical tests. We will show that when controlling for this and at the same time keeping the power of the experiment at a reasonably high level, it is unlikely that the number of test subjects normally used and recommended by the International Telecommunication Union (ITU), i.e., 15, is sufficient, but the number used by the Video Quality Experts Group (VQEG), i.e., 24, is more likely to be sufficient. Examples will also be given for the influence of Type I error on the statistical significance of comparing objective metrics by correlation. We also present a comparison between parametric and nonparametric statistics. The comparison targets the question of whether we would reach different conclusions on the statistical difference between the video quality ratings of different video clips in a subjective test, based on the comparison between Student's t-test and the Mann-Whitney U-test. We found hardly any difference when only a few comparisons are compensated for, i.e., almost the same conclusions are reached. When the number of comparisons is increased, larger and larger differences between the two methods are revealed. In these cases, the parametric t-test gives clearly more significant cases than the nonparametric test, which makes it more important to investigate whether the assumptions for performing a certain test are met.
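
The statement that the Type I error risk grows with the number of comparisons can be made concrete with a short sketch. It assumes independent tests and uses the Bonferroni correction as one common way to compensate; the abstract does not state which correction the paper applies, so both the independence assumption and the choice of correction are illustrative only.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false positive in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni_alpha(alpha, m):
    """Per-comparison level that keeps the family-wise rate at or below alpha."""
    return alpha / m

# Show how quickly the uncorrected risk grows with the number of comparisons.
for m in (1, 10, 100):
    uncorrected = familywise_error_rate(0.05, m)
    corrected = familywise_error_rate(bonferroni_alpha(0.05, m), m)
    print(f"m={m:3d}  uncorrected={uncorrected:.3f}  bonferroni={corrected:.3f}")
```

With ten comparisons at a nominal level of 0.05, the chance of at least one false positive is already about 40%. The stricter per-comparison level restores the family-wise rate but lowers the power for a fixed number of subjects, which is the trade-off the abstract describes.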
Journal article
  • J. Li
  • J. Wang
  • Marcus Barkowsky
  • P. Callet

Exploring the effects of subjective methodology on assessing visual discomfort in immersive multimedia.

In: Electronic Imaging, Human Vision and Electronic Imaging

  • (2018)

DOI: 10.2352/ISSN.2470-1173.2018.14.HVEI-527

Visual discomfort is an important factor that influences viewing experience in immersive multimedia, for example, 3DTV and VR. With the added value of depth, the novel perceptual experience, visual discomfort is not an easy task for observers to evaluate. In this study, we investigate how the subjective methodology affects the test results in the 3DTV condition. Two subjective visual discomfort experiments were conducted. One used the Pair Comparison (PC) method and the other used the Absolute Category Rating (ACR) method. The results demonstrated that the PC method had more powerful discriminability. For difficult perception-related tasks, such as visual discomfort in our study, PC was easier for the observers to understand and conduct, which led to reliable results. The study also revealed some very important but usually ignored conclusions on the subjective experiment, i.e., for measuring the perceived visual discomfort, the observer's judgment behavior might be affected by the test methodology.
Journal article
  • A. Aldahdooh
  • Marcus Barkowsky
  • P. Le Callet

Proof-of-concept: role of generic content characteristics in optimizing video encoders.

In: Multimedia Tools and Applications vol. 77 pg. 16069-16097

  • (2018)

DOI: 10.1007/s11042-017-5180-1

The influence of content characteristics on the efficiency of redundancy and irrelevance reduction in video coding is well known. Each new standard in video coding includes additional coding tools that potentially increase the complexity of the encoding process in order to gain further rate-distortion efficiency. In order to be versatile, encoder implementations often neglect the content dependency or they optimize the encoding complexity on a local scale, i.e. on a single frame or on the coding unit level, without being aware of the global content type. In this contribution, an analysis is presented of which coding tool settings of the recent High Efficiency Video Coding (HEVC) standard are most efficient for a given content type when balancing rate-distortion against computational complexity measured in encoding time. The content type is algorithmically determined, leading to a framework for rate-distortion-complexity based encoder parameter decision for any given video sequence. The implementability is demonstrated using a set of 35 Ultra-HD (UHD) sequences. The performance results and evaluations show that the encoding parameters may be predicted to optimize the video coding. For instance, predicting motion search range achieves complexity reduction of 36% on average when HEVC reference HM is used at a cost of bitrate (2%). When another HEVC coding standard software, x265, is used to predict the coding unit (CU) size, there is a reduction of 20% in bitrate and of 8% in distortion but there is a reduction of 6% in execution time.
Journal article
  • A. Aldahdooh
  • E. Masala
  • G. Wallendael
  • Marcus Barkowsky

Reproducible research framework for objective video quality measures using a large-scale database approach.

In: SoftwareX vol. 8 pg. 64-68

  • (2018)

DOI: 10.1016/j.softx.2017.09.004

This work presents a framework to facilitate reproducibility of research in video quality evaluation. Its initial version is built around the JEG-Hybrid database of HEVC coded video sequences. The framework is modular, organized in the form of pipelined activities, which range from the tools needed to generate the whole database from reference signals up to the analysis of the video quality measures already present in the database. Researchers can re-run, modify and extend any module, starting from any point in the pipeline, while always achieving perfect reproducibility of the results. The modularity of the structure makes it possible to work on subsets of the database, since some analyses might be too computationally intensive on the whole database. For this purpose, the framework also includes a software module to compute interesting subsets, in terms of coding conditions, of the whole database. An example shows how the framework can be used to investigate how small differences in the definition of the widespread PSNR metric can yield very different results, discussed in more detail in our accompanying research paper, Aldahdooh et al. (2018). This further underlines the importance of reproducibility to allow comparing different research work with high confidence. To the best of our knowledge, this framework is the first attempt to bring exact reproducibility end-to-end in the context of video quality evaluation research.
Contribution
  • Katharina Heydn
  • Marc-Philipp Dietrich
  • Marcus Barkowsky
  • Götz Winterfeldt
  • S. Mammen
  • A. Nüchter

The Golden Bullet: A Comparative Study for Target Acquisition, Pointing and Shooting.

In: Proceedings of the 2019 11th International Conference on Virtual Worlds and Games for Serious Applications (VS-Games) [4-6 Sept. 2019; Vienna, Austria].

  • Eds.:
  • Institute of Electrical and Electronics Engineers Inc.

  • (2019)

DOI: 10.1109/VS-Games.2019.8864589

In this study, we evaluate an interaction sequence performed by six modalities consisting of desktop-based (DB) and virtual reality (VR) environments using different input devices. For the given study, we implemented a vertical prototype of a first person shooter (FPS) game scenario, focusing on the genre-defining point-and-shoot mechanic. We introduce measures to evaluate the success of the according interaction sequence (times for target acquisition, pointing, shooting, overall net time, and number of shots) and conduct experiments to record and compare the users' performances. We show that interacting using head-tracking for landscape-rotation is performing similarly to the input of a screen-centered mouse and also yielded shortest times in target acquisition and pointing. Although using head-tracking for target acquisition and pointing was most efficient, subjects rated the modality using head-tracking for target acquisition and a 3DOF Controller for pointing best. Eye-tracking (ET) yields promising results, but calibration issues need to be resolved to enhance reliability and overall user experience.
Journal article
  • T. Mizdos
  • Marcus Barkowsky
  • M. Uhrina
  • P. Pocta

Linking Bitstream Information to QoE: A Study on Still Images Using HEVC Intra Coding.

In: Advances in Electrical and Electronic Engineering (AEEE) vol. 17 pg. 436-445

  • (2019)

DOI: 10.15598/aeee.v17i4.3625

The coding tools used in image and video encoders aim at high perceptual quality for low bitrates. Analyzing the results of the encoders in terms of quantization parameter, image partitioning, prediction modes or residuals may provide important insight into the link between those tools and the human perception. As a first step, this contribution analyzes the possibility to transcode reference images of three well-known image databases, i.e. IRCCyN/IVC, LIVE and TID2013, from their original, older formats to HEVC; thus creating a homogeneous database of 327 HEVC encoded images accompanied with bitstream parameters and values obtained from objective and subjective assessments. Secondly, it analyzes some of the HEVC intra coding parameters regarding their influence on the image quality by using machine learning, namely Support Vector Machine - Regression.
Contribution
  • L. Tiotsop
  • E. Masala
  • A. Aldahdooh
  • G. Wallendael
  • Marcus Barkowsky

Computing Quality-of-Experience Ranges for Video Quality Estimation.

In: Proceedings of the 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX) [5-7 June 2019; Berlin].

  • Eds.:
  • Institute of Electrical and Electronics Engineers Inc.

  • (2019)

DOI: 10.1109/QoMEX.2019.8743303

Typically, the measurement of the Quality of Experience for video sequences aims at a single value, in most cases the Mean Opinion Score (MOS). Predicting this value using various algorithms has been widely studied. However, deviation from the MOS is often handled as an unpredictable error. The approach in this contribution estimates intervals of video quality instead of the single valued MOS. Well-known video quality estimators are fused together to output a lower and upper border for the expected video quality, on the basis of a model derived from a well-known subjectively annotated dataset. Results on different datasets provide insight on the suitability of the well-known estimators for this particular approach.
Journal article
  • A. Aldahdooh
  • E. Masala
  • G. Wallendael
  • P. Lambert
  • Marcus Barkowsky

Improving relevant subjective testing for validation: Comparing machine learning algorithms for finding similarities in VQA datasets using objective measures.

In: Signal Processing: Image Communication vol. 74 pg. 32-41

  • (2019)

DOI: 10.1016/j.image.2019.01.004

Subjective quality assessment is a necessary activity to validate objective measures or to assess the performance of innovative video processing technologies. However, designing and performing comprehensive tests requires expertise and a large effort especially for the execution part. In this work we propose a methodology that, given a set of processed video sequences prepared by video quality experts, attempts to reduce the number of subjective tests by selecting a subset with minimum size which is expected to yield the same conclusions of the larger set. To this aim, we combine information coming from different types of objective quality metrics with clustering and machine learning algorithms that perform the actual selection, therefore reducing the required subjective assessment effort while trying to preserve the variety of content and conditions needed to ensure the validity of the conclusions. Experiments are conducted on one of the largest publicly available subjectively annotated video sequence dataset. As performance criterion, we chose the validation criteria for video quality measurement algorithms established by the International Telecommunication Union.
Contribution
  • L. Tiotsop
  • T. Mizdos
  • M. Uhrina
  • P. Pocta
  • Marcus Barkowsky
  • E. Masala

Predicting Single Observers Votes from Objective Measures using Neural Networks.

In: Proceedings of the IS&T International Symposium on Human Vision and Electronic Imaging 2020 (HVEI2020).

  • (2020)
Contribution
  • L. Tiotsop
  • T. Mizdos
  • E. Masala
  • Marcus Barkowsky
  • P. Pocta

How to Train No Reference Video Quality Measures for New Coding Standards using Existing Annotated Datasets?

In: 2021 IEEE 23rd International Workshop on Multimedia Signal Processing (MMSP). pg. 1-6

  • Eds.:
  • Institute of Electrical and Electronics Engineers Inc.

IEEE

  • (2021)

DOI: 10.1109/MMSP53017.2021.9733456

Subjective experiments are important for developing objective Video Quality Measures (VQMs). However, they are time-consuming and resource-demanding. In this context, being able to reuse existing subjective data on previous video coding standards to train models capable of predicting the perceptual quality of video content processed with newer codecs acquires significant importance. This paper investigates the possibility of generating an HEVC encoded Processed Video Sequence (PVS) in such a way that its perceptual quality is as similar as possible to that of an AVC encoded PVS whose quality has already been assessed by human subjects. In this way, the perceptual quality of the newly generated HEVC encoded PVS may be annotated approximately with the Mean Opinion Score (MOS) of the related AVC encoded PVS. To show the effectiveness of our approach, we compared the performance of a simple, low-complexity yet effective no-reference hybrid model trained on the data generated with our approach with the same model trained on data collected in the context of a pristine subjective experiment. In addition, we merged seven subjective experiments such that they can be used as one aligned dataset containing either original HEVC bitstreams or the newly generated data explained in our proposed approach. The merging process accounts for the differences in terms of quality scale, chosen assessment method and context influence factors. This yields a large annotated dataset of HEVC sequences that is made publicly available for the design and training of no-reference hybrid VQMs for HEVC encoded content.
Journal article
  • T. Mizdos
  • Marcus Barkowsky
  • M. Uhrina
  • P. Pocta

How to reuse existing annotated image quality datasets to enlarge available training data with new distortion types.

In: Multimedia Tools and Applications vol. 80 pg. 28137-28159

  • (2021)

DOI: 10.1007/s11042-021-10679-5

There is a continuing demand for objective measures that predict perceived media quality. Researchers are developing new methods for mapping technical parameters of digital media to the perceived quality. It is quite common to use machine learning algorithms for these purposes especially deep learning algorithms, which need large amounts of data for training. In this paper, we aim towards getting more training data with recent types of distortions. Instead of doing expensive subjective experiments, we evaluate the reuse of previously published, well-known image datasets with subjective annotation. In this contribution, the procedure of mapping Mean Opinion Scores (MOS) from an already published subjectively annotated dataset with older codecs to new codecs is presented. In particular, we map from Joint Photographic Experts Group (JPEG) distortions to newer High Efficiency Video Coding (HEVC) distortions. We have used values of three different objective methods as a connection between these two different distortion types. In order to investigate the significance of our approach, subjective verification tests were designed and conducted. The design goals led to two types of experiments, i.e. Pair Comparison (PC) test and Absolute Category Rating (ACR) test, in which 40 participants provided their opinion. Results of the subjective experiments indicate that it may be possible to use information gained from older datasets to describe the perceived quality of more recent compression algorithms.
Journal article
  • L. Tiotsop
  • T. Mizdos
  • M. Uhrina
  • Marcus Barkowsky
  • P. Pocta
  • E. Masala

Modeling and estimating the subjects’ diversity of opinions in video quality assessment: a neural network based approach.

In: Multimedia Tools and Applications vol. 80 pg. 3469-3487

  • (2021)

DOI: 10.1007/s11042-020-09704-w

Subjective experiments are considered the most reliable way to assess the perceived visual quality. However, observers’ opinions are characterized by large diversity: in fact, even the same observer is often not able to exactly repeat his first opinion when rating again a given stimulus. This makes the Mean Opinion Score (MOS) alone, in many cases, not sufficient to get accurate information about the perceived visual quality. To this aim, it is important to have a measure characterizing to what extent the observed or predicted MOS value is reliable and stable. For instance, the Standard deviation of the Opinions of the Subjects (SOS) could be considered as a measure of reliability when evaluating the quality subjectively. However, we are not aware of the existence of models or algorithms that allow to objectively predict how much diversity would be observed in subjects’ opinions in terms of SOS. In this work we observe, on the basis of a statistical analysis made on several subjective experiments, that the disagreement between the quality as measured by means of different objective video quality metrics (VQMs) can provide information on the diversity of the observers’ ratings on a given processed video sequence (PVS). In light of this observation we: i) propose and validate a model for the SOS observed in a subjective experiment; ii) design and train Neural Networks (NNs) that predict the average diversity that would be observed among the subjects’ ratings for a PVS starting from a set of VQMs values computed on such a PVS; iii) give insights into how the same NN based approach can be used to identify potential anomalies in the data collected in subjective experiments.
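
A small sketch shows why the MOS alone may hide how much observers disagree. The ratings below are made up for illustration, and the SOS is computed here as the sample standard deviation (n-1 denominator), which is an assumption; the model and neural networks proposed in the paper are not reproduced.

```python
import statistics

# Hypothetical 5-point ratings for two processed video sequences (PVSs).
ratings = {
    "pvs_consensus": [3, 3, 3, 3],
    "pvs_polarized": [1, 5, 1, 5],
}

def mos(votes):
    """Mean Opinion Score: arithmetic mean of the individual ratings."""
    return statistics.mean(votes)

def sos(votes):
    """Standard deviation of Opinion Scores (sample standard deviation, assumed here)."""
    return statistics.stdev(votes)

for name, votes in ratings.items():
    print(f"{name}: MOS={mos(votes):.2f}  SOS={sos(votes):.2f}")
```

Both sequences have the same MOS of 3, yet the observers agree completely on the first and are split on the second, which only the SOS reveals.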
Journal article
  • L. Tiotsop
  • F. Agboma
  • G. van Wallendael
  • A. Aldahdooh
  • S. Bosse
  • L. Janowski
  • Marcus Barkowsky
  • E. Masala

On the Link Between Subjective Score Prediction and Disagreement of Video Quality Metrics.

In: IEEE Access vol. 9 pg. 152923-152937

  • (2021)

DOI: 10.1109/ACCESS.2021.3127395

Several video quality metrics (VQMs) have been proposed in many publications to predict how humans perceive video quality. It is common to observe significant disagreements amongst the quality predictions of these VQMs for the same video sequence. Following an extensive literature search, we found no publicised work that has investigated if such disagreements convey useful information on the accuracy of VQMs. Herein, a measure for quantifying the disagreement between VQMs is proposed. A small-scale subjective study is carried out to assess the effectiveness of our proposal. In particular, the proposed disagreement measure is shown to be extremely effective in determining whether the quality of any given processed video sequence (PVS) can be accurately predicted by the VQMs. This type of information is particularly useful for identifying video sequences that are likely to degrade the end-user’s quality of experience (QoE). Our proposal is also useful in selecting the most effective PVSs to be employed in a subjective test. We show that the proposed disagreement measure can be effectively predicted from bitstream features. This establishes a link between the capability to accurately assess the quality of a PVS and the way it is encoded. In addition, an analysis is conducted to compare the performances of some well-known and widely used open-source metrics and two proprietary metrics. The two proprietary metrics are used by a large media company for enhancing its delivery pipeline. The outcome of this comparison highlights the suitability of the open-source VQM, Video Multi-method Assessment Fusion (VMAF), as a good benchmark quality measure for both the industrial and academic environments.
Journal article
  • L. Tiotsop
  • T. Mizdos
  • Marcus Barkowsky
  • P. Pocta
  • A. Servetti
  • E. Masala

Mimicking Individual Media Quality Perception with Neural Network based Artificial Observers.

In: ACM Transactions on Multimedia Computing, Communications, and Applications vol. 18 pg. 1-25

  • (2022)

DOI: 10.1145/3464393

The media quality assessment research community has traditionally been focusing on developing objective algorithms to predict the result of a typical subjective experiment in terms of Mean Opinion Score (MOS) value. However, the MOS, being a single value, is insufficient to model the complexity and diversity of human opinions encountered in an actual subjective experiment. In this work we propose a complementary approach for objective media quality assessment that attempts to more closely model what happens in a subjective experiment in terms of single observers and, at the same time, we perform a qualitative analysis of the proposed approach while highlighting its suitability. More precisely, we propose to model, using neural networks (NNs), the way single observers perceive media quality. Once trained, these NNs, one for each observer, are expected to mimic the corresponding observer in terms of quality perception. Then, similarly to a subjective experiment, such NNs can be used to simulate the users’ single opinions, which can later be aggregated by means of different statistical indicators such as average, standard deviation, quantiles, etc. Unlike previous approaches that consider subjective experiments as a black box providing reliable ground truth data for training, the proposed approach is able to consider human factors by analyzing and weighting individual observers. Such a model may therefore implicitly account for users’ expectations and tendencies, which have been shown in many studies to significantly correlate with visual quality perception. Furthermore, our proposal also introduces and investigates an index measuring how much inconsistency there would be if an observer were asked to rate the same stimulus many times. Simulation experiments conducted on several datasets demonstrate that the proposed approach can be effectively implemented in practice, thus yielding a more complete objective assessment of end users’ quality of experience.
Journal article
  • T. Mizdos
  • Marcus Barkowsky
  • P. Pocta
  • M. Uhrina

30 Years of Video Coding Evolution - What Can We Learn from it in Terms of QoE?

In: Advances in Electrical and Electronic Engineering (AEEE) vol. 20

  • (2022)

DOI: 10.15598/aeee.v20i3.3998

From the beginnings of ITU-T H.261 to H.265 (HEVC), each new video coding standard has aimed at halving the bitrate at the same perceptual quality by redundancy and irrelevancy reduction. Each improvement has been explained by comparably small changes in the video coding toolset. This contribution aims at starting the Quality of Experience (QoE) analysis of the accumulated improvements over the last thirty years. Based on an overview of the changes in the coding tools, we analyze the changes in the quantized residual information. Visual comparison and statistical measures are performed and some interpretations are provided towards explaining how irrelevancy reduction may have led to such a huge reduction in bitrate. The interpretation of the results in terms of QoE paves the way towards an understanding of the coding tools in terms of visual quality. It may help in understanding how the irrelevancy reduction has been improved over the decades. Understanding how the differences of the residuals relate to known or yet unknown properties of the human visual system, may enable a closer collaboration between perception research and video compression research.
Contribution
  • L. Tiotsop
  • A. Servetti
  • Marcus Barkowsky
  • E. Masala

Regularized Maximum Likelihood Estimation of the Subjective Quality from Noisy Individual Ratings.

In: 2022 14th International Conference on Quality of Multimedia Experience (QoMEX). pg. 1-4

Institute of Electrical and Electronics Engineers, Inc.

  • (2022)

DOI: 10.1109/QoMEX55416.2022.9900903

Although several approaches to recover the ground truth subjective quality score from noisy individual ratings in subjective experiments have been explored in the literature, there is still room for improvement, in particular in terms of robustness to noise. This paper proposes a new approach that combines the traditional maximum likelihood estimation framework with a newly proposed regularization term, based on information theory concepts, that is meant to underweight surprising ratings of the quality of a given stimulus, regarded as a manifestation of noise, in the final analytical expression of the recovered subjective quality. Computational experiments show the higher robustness to noise of our proposal when compared to three state-of-the-art methods.
Contribution
  • P. Majer
  • L. Tiotsop
  • Marcus Barkowsky

Training the DNN of a Single Observer by Conducting Individualized Subjective Experiments.

In: 2023 15th International Conference on Quality of Multimedia Experience (QoMEX). pg. 103-106

IEEE

  • (2023)

DOI: 10.1109/QoMEX58391.2023.10178608

Predicting the quality perception of an individual subject instead of the mean opinion score is a new and very promising research direction. Deep Neural Networks (DNNs) are suitable for such prediction but the training process is particularly data demanding due to the noisy nature of individual opinion scores. We propose a human-in-the-loop training process using multiple cycles of a human voting, DNN training, and inference procedure. Thus, opinion scores on individualized sets of images were progressively collected from each observer to refine the performance of their DNN. The results of computational experiments demonstrate the effectiveness of our approach. For future research and benchmarking, five DNNs trained to mimic five observers are released together with a dataset containing the 1500 opinion scores progressively gathered from each of these observers during our training cycles.
Journal article
  • L. Tiotsop
  • A. Servetti
  • Marcus Barkowsky
  • P. Pocta
  • T. Mizdos
  • G. van Wallendael
  • E. Masala

Predicting individual quality ratings of compressed images through deep CNNs-based artificial observers.

In: Signal Processing: Image Communication vol. 112 pg. 116917

  • (2023)

DOI: 10.1016/j.image.2022.116917

Unlike traditional objective approaches aimed at MOS prediction, subjective experiments provide individual opinion scores that make it possible, for instance, to estimate the distribution of users’ opinion scores. Unfortunately, the current literature lacks objective quality assessment approaches that simulate the process of a subjective test. Therefore, this work focuses on modeling an individual subject through a deep CNN that, once trained, is expected to mimic the subject in terms of quality perception; for this reason, we call it “Artificial Intelligence-based Observer” (AIO). Several AIOs, modeling subjects with different characteristics, can be derived and used to simulate the process of a subjective test, thus yielding a more complete objective quality assessment. However, the training of the AIOs is hindered by two major issues: (i) the lack of training sets containing a large number of individual opinion scores; (ii) the noisy nature of individual opinion scores used as ground truth. To overcome these issues, we motivate a two-step learning approach. During the first learning step, the architecture of the well-known ResNet50 is appropriately modified and its initial weights are updated using a large-scale synthetically annotated dataset of JPEG compressed images created for quality assessment purposes. This yields a new deep CNN called JPEGResNet50 that can accurately evaluate the quality of JPEG compressed images. The second learning step, conducted on a subjectively annotated dataset, refines the generic perceptual quality features already learned by the JPEGResNet50 to derive the AIO of each subject. Extensive computational experiments show the potential and effectiveness of our approach.
Journal article
  • L. Tiotsop
  • A. Servetti
  • P. Pocta
  • G. van Wallendael
  • Marcus Barkowsky
  • E. Masala

Multiple Image Distortion DNN Modeling Individual Subject Quality Assessment.

In: ACM Transactions on Multimedia Computing, Communications, and Applications vol. 20 pg. 1-27

  • (2024)

DOI: 10.1145/3664198

A recent research direction is focused on training Deep Neural Networks (DNNs) to replicate individual subject assessments of media quality. These DNNs are referred to as Artificial Intelligence-based Observers (AIOs). An AIO is designed to simulate, in real time, the quality ratings of a specific individual, enabling an automatic quality assessment that accounts for subjects' characteristics and preferences. Training AIOs is a promising but challenging research area due to the greater noise in individual raw opinion scores compared to the Mean Opinion Score. Effective learning from noisy labels necessitates the training of complex models on large-scale datasets. Unfortunately, this is challenging for AIOs as the media quality assessment community lacks extensive datasets that include individual opinion scores. To address the complexity of the task, we first created a dataset comprising two million samples, with synthetic labels derived from human annotation. We then trained a customized network for image quality assessment, named Multi-Distortion ResNet50 (MDResNet50), on this dataset. The weights of the MDResNet50 were subsequently utilized to initialize the learning process of each AIO, thereby avoiding the need to train a complex model from scratch on a small-scale dataset with raw individual opinion scores. Computational experiments show that our approach significantly advances the state-of-the-art in AIO research.
In particular: (i) we demonstrate through a simulation the ability of AIOs to mimic two well-known behavioral characteristics of a subject, i.e., bias and inconsistency, when scoring the media quality; (ii) we train and release DNN-based AIOs that, compared to the state-of-the-art, exhibit a statistically significantly higher performance in assessing multiple image distortions; (iii) we train AIOs that more accurately mimic the sensitivity of real subjects to noise and color saturation and also better predict the opinion score distribution compared to the state-of-the-art AIOs.
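
The two behavioral characteristics named in this abstract, bias and inconsistency, can be illustrated with a minimal sketch. It does not reproduce the AIO method of the paper. It assumes a simple additive model in which a subject's score is the true quality plus a constant offset (bias) plus zero-mean Gaussian noise whose standard deviation represents inconsistency, rounded and clipped to a 5-point scale. The function names, parameter values, and the Gaussian assumption are all hypothetical.

```python
import random


def simulate_rating(true_quality, bias, inconsistency, rng):
    """Draw one discrete opinion score on a 1..5 scale.

    Assumed illustrative model (not taken from the paper):
    score = round(true_quality + bias + Gaussian noise), clipped to [1, 5].
    """
    noisy = true_quality + bias + rng.gauss(0.0, inconsistency)
    return min(5, max(1, round(noisy)))


def simulate_subject(true_qualities, bias, inconsistency, seed=0):
    """Simulate one subject rating every stimulus in true_qualities."""
    rng = random.Random(seed)
    return [simulate_rating(q, bias, inconsistency, rng) for q in true_qualities]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    stimuli = [3.0] * 2000  # the same mid-quality stimulus, rated repeatedly
    lenient = simulate_subject(stimuli, bias=0.8, inconsistency=0.3, seed=1)
    erratic = simulate_subject(stimuli, bias=0.0, inconsistency=1.2, seed=2)
    # A biased subject shifts the mean; an inconsistent subject spreads the scores.
    print("lenient: mean", mean(lenient), "variance", variance(lenient))
    print("erratic: mean", mean(erratic), "variance", variance(erratic))
```

Under this assumed model, the biased subject's mean score moves away from the true quality while the scores stay tightly grouped, whereas the inconsistent subject's mean stays near the true quality but the scores spread across the scale.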
Journal article
  • L. Tiotsop
  • A. Servetti
  • Marcus Barkowsky
  • E. Masala

Modeling Subject Scoring Behaviors in Subjective Experiments Based on a Discrete Quality Scale.

In: IEEE Transactions on Multimedia vol. 26 pg. 8742-8757

  • (2024)

DOI: 10.1109/TMM.2024.3382483

Several approaches have been proposed to estimate quality in subjective experiments while highlighting peculiar subject behaviors. However, there is room for improvement in existing approaches, both in terms of robustness to noise and the ability to accurately indicate several peculiar subject behaviors in subjective experiments. This work advances the state-of-the-art in three main directions: i) a new approach to estimate the subjective quality from noisy ratings is proposed and is shown to be more robust to noise than four state-of-the-art approaches; ii) a novel subject scoring model is proposed that makes it possible to highlight several peculiar behaviors typically observed in subjective experiments; and iii) our proposed probabilistic subject scoring model results from the proof of a theorem, whereas in previous approaches a probabilistic scoring model is assumed a priori. This represents an important first step toward models supported by a stronger theoretical foundation. Numerical experiments conducted on several datasets highlight the effectiveness of our proposal.